Biomarker for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS)

ABSTRACT

The present disclosure provides a method for diagnosing myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) that uses the B cell receptor (BCR) repertoire as an indicator of ME/CFS. One or more variables selected from the group consisting of the usage frequency of one or more genes in the IgG H chain variable region of the BCR in a subject, the BCR diversity index in the subject, and the count of one or more immune cell subpopulations in the subject can be used as an indicator of ME/CFS in the subject.

TECHNICAL FIELD

The present disclosure relates to the field of diagnosis of diseases, especially myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS).

BACKGROUND ART

Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a disease with severe fatigue, exhaustion after exertion, sleep disorder, cognitive dysfunction, or orthostatic intolerance as the core symptom and involving various symptoms such as pain, autonomic neuropathy, or hypersensitivity to light, sound, food, or chemical substance. While the syndrome develops after an infection in some cases, the pathology is still not elucidated. Since there is no biomarker for identifying the development thereof, various diagnostic criteria are used, which is an impediment to diagnosis or research. Further, an effective therapeutic method has not been established.

SUMMARY OF INVENTION

Solution to Problem

The present disclosure provides a method of using a B cell receptor (BCR) repertoire as an indicator of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS). It is demonstrated herein that usage frequency of a gene in an IgG H chain variable region of a BCR varies in an ME/CFS patient in comparison to a healthy control. Such information can be used to diagnose ME/CFS. In addition to a gene in an IgG H chain variable region of a BCR, the immune cell subpopulation (B cell, regulatory T cell, or the like) count in a subject can also be used as an indicator. The usage frequency of a gene can be determined by a method comprising large scale highly efficient BCR repertoire analysis.

Examples of embodiments of the present disclosure are specified in the following items.

(Item 1)

A method of using a B cell receptor (BCR) repertoire in a subject as an indicator of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in the subject.

(Item 2)

The method of the preceding item, using one or more variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR in a subject as an indicator of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in the subject.

(Item 2A)

The method of any of the preceding items, further characterized in that the one or more variables is an indicator of the subject suffering from ME/CFS, and not another disease.

(Item 3)

The method of any of the preceding items, wherein the one or more genes comprise at least one gene selected from the group consisting of IGHV1-2, IGHV1-3, IGHV1-8, IGHV1-18, IGHV1-24, IGHV1-38-4, IGHV1-45, IGHV1-46, IGHV1-58, IGHV1-69, IGHV1-69-2, IGHV1-69D, IGHV1/OR15-1, IGHV1/OR15-5, IGHV1/OR15-9, IGHV1/OR21-1, IGHV2-5, IGHV2-26, IGHV2-70, IGHV2-70D, IGHV2/OR16-5, IGHV3-7, IGHV3-9, IGHV3-11, IGHV3-13, IGHV3-15, IGHV3-16, IGHV3-20, IGHV3-21, IGHV3-23, IGHV3-23D, IGHV3-25, IGHV3-30, IGHV3-30-3, IGHV3-30-5, IGHV3-33, IGHV3-35, IGHV3-38, IGHV3-38-3, IGHV3-43, IGHV3-43D, IGHV3-48, IGHV3-49, IGHV3-53, IGHV3-64, IGHV3-64D, IGHV3-66, IGHV3-72, IGHV3-73, IGHV3-74, IGHV3-NL1, IGHV3/OR15-7, IGHV3/OR16-6, IGHV3/OR16-8, IGHV3/OR16-9, IGHV3/OR16-10, IGHV3/OR16-12, IGHV3/OR16-13, IGHV4-4, IGHV4-28, IGHV4-30-2, IGHV4-30-4, IGHV4-31, IGHV4-34, IGHV4-38-2, IGHV4-39, IGHV4-59, IGHV4-61, IGHV4/OR15-8, IGHV5-10-1, IGHV5-51, IGHV6-1, IGHV7-4-1, IGHV7-81, IGHD1-1, IGHD1-7, IGHD1-14, IGHD1-20, IGHD1-26, IGHD1/OR15-1a/b, IGHD2-2, IGHD2-8, IGHD2-15, IGHD2-21, IGHD2/OR15-2a/b, IGHD3-3, IGHD3-9, IGHD3-10, IGHD3-16, IGHD3-22, IGHD3/OR15-3a/b, IGHD4-4, IGHD4-11, IGHD4-17, IGHD4-23, IGHD4/OR15-4a/b, IGHD5-5, IGHD5-12, IGHD5-18, IGHD5-24, IGHD5/OR15-5a/b, IGHD6-6, IGHD6-13, IGHD6-19, IGHD6-25, IGHD7-27, IGHJ1, IGHJ2, IGHJ3, IGHJ4, IGHJ5, IGHJ6, IGHG1, IGHG2, IGHG3, IGHG4, and IGHGP.

(Item 3-1)

The method of any of the preceding items, wherein the one or more genes comprise at least one gene selected from the group consisting of IGHV3-73, IGHV1-69-2, IGHV5-51, IGHV4-31, IGHV3-23D, IGHV1/OR15-9, IGHV4-39, IGHD5-12, IGHV3-43D, IGHD4-17, IGHV5-10-1, IGHD4/OR15-4a/b, IGHG4, IGHV1/OR15-5, IGHV3/OR16-9, IGHD1-7, IGHV3-21, IGHD6-6, IGHV3-33, IGHD4-23, IGHV3-30-5, IGHV3-23, IGHD6-13, IGHV3-64D, IGHV3-48, IGHV3-64, IGHG1, IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, and IGHD3-22.

(Item 3-2)

The method of any of the preceding items, wherein the one or more genes comprise at least one gene selected from the group consisting of IGHD6-6, IGHV3-33, IGHD4-23, IGHV3-30-5, IGHV3-23, IGHD6-13, IGHV3-64D, IGHV3-48, IGHV3-64, IGHG1, IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, and IGHD3-22.

(Item 3-3)

The method of any of the preceding items, wherein the one or more genes comprise at least one gene selected from the group consisting of IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, and IGHD3-22.

(Item 3-4)

The method of any of the preceding items, wherein the one or more genes comprise at least one gene selected from the group consisting of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6.

(Item 4-1)

The method of any of the preceding items, wherein the one or more variables further comprise one or more immune cell subpopulation counts.

(Item 4-2)

The method of any of the preceding items, wherein the one or more variables exhibit AUC ≥0.7 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS.

(Item 4-3)

The method of any of the preceding items, wherein the one or more variables exhibit AUC ≥0.8 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS.

(Item 4-4)

The method of any of the preceding items, wherein the immune cell subpopulation count is selected from a B cell count, a naïve B cell count, a memory B cell count, a plasmablast count, an activated naïve B cell count, a transitional B cell count, a regulatory T cell count, a memory T cell count, a follicular helper T cell count, a Tfh1 cell count, a Tfh2 cell count, a Tfh17 cell count, a Th1 cell count, a Th2 cell count, and a Th17 cell count.

(Item 5-1)

The method of any of the preceding items, wherein the one or more variables comprise usage frequencies of two or more genes in an IgG H chain variable region of a BCR in the subject.

(Item 5-2)

The method of any of the preceding items, wherein the one or more variables exhibit AUC ≥0.7 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS.

(Item 5-3)

The method of any of the preceding items, wherein the one or more variables exhibit AUC ≥0.8 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS.

(Item 5-4)

The method of any of the preceding items, wherein the one or more variables exhibit AUC ≥0.9 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS.

(Item 6-1)

The method of any of the preceding items, wherein the one or more variables comprise a combination of two or more variables selected from the group consisting of a usage frequency of one or more genes in an IgG H chain variable region of a BCR in the subject, a BCR diversity index in the subject, and one or more immune cell subpopulation counts in the subject, and wherein the one or more variables exhibit AUC ≥0.7 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS.

(Item 6-2)

The method of any of the preceding items, wherein the one or more variables comprise three or more variables selected from the group.

(Item 6-3)

The method of any of the preceding items, wherein the one or more variables comprise four or more variables selected from the group.

(Item 6-4)

The method of any of the preceding items, wherein the one or more variables comprise five or more variables selected from the group.

(Item 6-5)

The method of any of the preceding items, wherein the one or more variables comprise six or more variables selected from the group.

(Item 6-6)

The method of any of the preceding items, wherein the one or more variables exhibit AUC ≥0.8 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS.

(Item 6-7)

The method of any of the preceding items, wherein the one or more variables exhibit AUC ≥0.9 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS.

(Item 7)

The method of any of the preceding items, wherein the one or more genes comprise IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6.

(Item 8)

The method of any of the preceding items, wherein the one or more variables comprise a B cell count in the subject.

(Item 9)

The method of any of the preceding items, wherein the one or more variables comprise a regulatory T cell (Treg) count in the subject.

(Item 10)

The method of any of the preceding items, wherein the usage frequency of the one or more genes is determined by a method comprising large scale highly efficient BCR repertoire analysis.

(Item 11)

The method of any of the preceding items, wherein the one or more variables comprise a usage frequency of at least one gene selected from the group consisting of IGHV3-49, IGHV3-30-3, IGHD1-26, IGHV3-30, IGHJ6, IGHGP, IGHV4-31, IGHV3-64, IGHD3-22, IGHV3-33, IGHV3-73, IGHV5-10-1, and IGHV4-34.

(Item 12)

The method of any of the preceding items, comprising:

(a) using a part of the one or more variables as an indicator of ME/CFS in the subject; and (b) using a part of the one or more variables as an indicator of the subject having ME/CFS, and not another disease.

(Item 13)

The method of any of the preceding items, wherein (b) is performed a plurality of times for a plurality of other diseases.

(Item 14)

The method of any of the preceding items, wherein the other disease comprises multiple sclerosis (MS).

(Item 15)

A method of using one or more variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR in a subject as an indicator of the subject suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), and not multiple sclerosis (MS).

(Item 15A)

The method of the preceding item, comprising a feature according to one or more of the preceding items.

(Item 16)

The method of any of the preceding items, wherein the one or more genes comprise at least one gene selected from the group consisting of IGHV1-2, IGHV1-3, IGHV1-8, IGHV1-18, IGHV1-24, IGHV1-38-4, IGHV1-45, IGHV1-46, IGHV1-58, IGHV1-69, IGHV1-69-2, IGHV1-69D, IGHV1/OR15-1, IGHV1/OR15-5, IGHV1/OR15-9, IGHV1/OR21-1, IGHV2-5, IGHV2-26, IGHV2-70, IGHV2-70D, IGHV2/OR16-5, IGHV3-7, IGHV3-9, IGHV3-11, IGHV3-13, IGHV3-15, IGHV3-16, IGHV3-20, IGHV3-21, IGHV3-23, IGHV3-23D, IGHV3-25, IGHV3-30, IGHV3-30-3, IGHV3-30-5, IGHV3-33, IGHV3-35, IGHV3-38, IGHV3-38-3, IGHV3-43, IGHV3-43D, IGHV3-48, IGHV3-49, IGHV3-53, IGHV3-64, IGHV3-64D, IGHV3-66, IGHV3-72, IGHV3-73, IGHV3-74, IGHV3-NL1, IGHV3/OR15-7, IGHV3/OR16-6, IGHV3/OR16-8, IGHV3/OR16-9, IGHV3/OR16-10, IGHV3/OR16-12, IGHV3/OR16-13, IGHV4-4, IGHV4-28, IGHV4-30-2, IGHV4-30-4, IGHV4-31, IGHV4-34, IGHV4-38-2, IGHV4-39, IGHV4-59, IGHV4-61, IGHV4/OR15-8, IGHV5-10-1, IGHV5-51, IGHV6-1, IGHV7-4-1, IGHV7-81, IGHD1-1, IGHD1-7, IGHD1-14, IGHD1-20, IGHD1-26, IGHD1/OR15-1a/b, IGHD2-2, IGHD2-8, IGHD2-15, IGHD2-21, IGHD2/OR15-2a/b, IGHD3-3, IGHD3-9, IGHD3-10, IGHD3-16, IGHD3-22, IGHD3/OR15-3a/b, IGHD4-4, IGHD4-11, IGHD4-17, IGHD4-23, IGHD4/OR15-4a/b, IGHD5-5, IGHD5-12, IGHD5-18, IGHD5-24, IGHD5/OR15-5a/b, IGHD6-6, IGHD6-13, IGHD6-19, IGHD6-25, IGHD7-27, IGHJ1, IGHJ2, IGHJ3, IGHJ4, IGHJ5, IGHJ6, IGHG1, IGHG2, IGHG3, IGHG4, and IGHGP.

(Item 17)

The method of any of the preceding items, wherein the one or more genes comprise at least one gene selected from the group consisting of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHD1-26, IGHV3-49, and IGHJ6.

(Item 18)

The method of any of the preceding items, wherein the one or more variables comprise usage frequencies of two or more genes in an IgG H chain variable region of a BCR in the subject.

(Item 19)

The method of any of the preceding items, wherein the one or more variables exhibit AUC ≥0.7 under a ROC curve in regression analysis for differentiating MS from ME/CFS.

(Item 20)

The method of any of the preceding items, wherein the one or more variables exhibit AUC ≥0.8 under a ROC curve in regression analysis for differentiating MS from ME/CFS.

(Item 21)

The method of any of the preceding items, wherein the one or more variables exhibit AUC ≥0.9 under a ROC curve in regression analysis for differentiating MS from ME/CFS.

(Item 22)

The method of any of the preceding items, wherein the one or more variables comprise usage frequencies of three or more genes in an IgG H chain variable region of a BCR in the subject.

(Item 23)

The method of any of the preceding items, wherein the one or more variables exhibit AUC ≥0.7 under a ROC curve in regression analysis for differentiating MS from ME/CFS.

(Item 24)

The method of any of the preceding items, wherein the one or more variables exhibit AUC ≥0.8 under a ROC curve in regression analysis for differentiating MS from ME/CFS.

(Item 25)

The method of any of the preceding items, wherein the one or more variables exhibit AUC ≥0.9 under a ROC curve in regression analysis for differentiating MS from ME/CFS.

(Item 26)

The method of any of the preceding items, wherein (b) is performed by the method of any of the preceding items.

(Item A1)

A method of diagnosing a subject as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), comprising: determining a B cell receptor (BCR) repertoire in a subject; and diagnosing the subject as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) based on the BCR repertoire.

(Item A1-1)

The method of the preceding item, comprising a feature according to one or more of the preceding items.

(Item A2)

A method of treating myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), comprising: determining a B cell receptor (BCR) repertoire in a subject; diagnosing the subject as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) based on the BCR repertoire; and administering therapy on the subject.

(Item A2-1)

The method of the preceding item, comprising a feature according to one or more of the preceding items.

(Item A3)

The method of any of the preceding items for diagnosing the subject as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) based on one or more variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR in the subject.

(Item A4)

The method of any of the preceding items, wherein a value of a formula comprising the variables is computed for the subject, and the value is compared with a threshold value to diagnose the subject as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS).

(Item A5)

The method of any of the preceding items, comprising:

(A) providing a plurality of variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR with regard to myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS); (B) providing a differentiation formula generated by multivariate analysis for the variables; (C) computing a probability of suffering from ME/CFS by fitting values of the variables of the subject into the differentiation formula; and (D) determining the subject as suffering from ME/CFS if the probability of suffering from ME/CFS is greater than a predetermined value.
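Steps (C) and (D) can be sketched in Python as follows. This is a minimal illustration that assumes the differentiation formula takes the logistic (logit) form used in logistic regression; the function names, the intercept, the coefficient, the subject value, and the cutoff of 0.5 are hypothetical placeholders and are not values from the present disclosure.

```python
import math


def mecfs_probability(values, intercept, coefficients):
    """Evaluate a logistic differentiation formula for one subject.

    values: variable name -> measured value (e.g. usage frequency in %).
    intercept, coefficients: hypothetical constants of the formula.
    """
    z = intercept + sum(coef * values[name] for name, coef in coefficients.items())
    # Numerically stable form of the logistic function 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def is_mecfs(values, intercept, coefficients, cutoff=0.5):
    """Step (D): positive when the probability exceeds the predetermined value."""
    return mecfs_probability(values, intercept, coefficients) > cutoff


# Illustrative numbers only; they are not taken from the disclosure.
subject = {"IGHV3-30": 10.0}
p = mecfs_probability(subject, intercept=-1.0, coefficients={"IGHV3-30": 0.2})
print(round(p, 3), is_mecfs(subject, -1.0, {"IGHV3-30": 0.2}))
```

In practice the predetermined value would be chosen from the ROC analysis of the fitted model rather than fixed at 0.5.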

(Item A6)

The method of any of the preceding items, comprising:

(A) providing a plurality of variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR with regard to myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS); (B) providing a differentiation formula generated by multivariate analysis for the variables, (B) comprising:

(B-1) performing univariate or multivariate logistic regression for the variables, with a distinction between patient/healthy individual as a response variable;

(B-2) computing a constant and a coefficient of a differentiation formula from a constant and a partial regression coefficient of a logit model formula generated in the logistic regression; and

(B-3) generating a differentiation formula based on the constant and the coefficient obtained from processing of B-2;

(C) computing a probability of suffering from ME/CFS by fitting values of the variables of the subject into the differentiation formula; and (D) determining the subject as suffering from ME/CFS if the probability of suffering from ME/CFS is greater than a predetermined value.
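For reference, the relationship between the logit model formula of (B-1) and the probability of (C) can be written as below, assuming the standard form of logistic regression. The symbols $x_i$ (value of the i-th variable), $\beta_0$ (constant), $\beta_i$ (partial regression coefficient), $p$ (probability of suffering from ME/CFS), and $c$ (predetermined value) are notation introduced here for illustration only.

```latex
\operatorname{logit}(p) \;=\; \ln\frac{p}{1-p} \;=\; \beta_0 + \sum_{i=1}^{k} \beta_i x_i
\quad\Longrightarrow\quad
p \;=\; \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \sum_{i=1}^{k} \beta_i x_i\right)\right)},
\qquad
\text{ME/CFS if } p > c .
```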

(Item A6-A)

The method of any of the preceding items, comprising:

(A) specifying a combination of variables, (A) comprising:

(A-1) comparing a healthy individual with a myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) patient and providing a plurality of variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR for which a significant difference is detected, for subjects including the ME/CFS patient and the healthy individual; and/or

(A-2) performing univariate logistic analysis using a gene in an IgG H chain variable region of the BCR as an independent variable or multivariate logistic analysis using two or more genes in an IgG H chain variable region of the BCR as independent variables to obtain a logistic regression model formula, and performing ROC analysis for measuring a degree of fit of the logistic regression model formula to select a gene exhibiting an AUC value equal to or greater than a predetermined value as a variable for a differentiation formula;

(B) providing a differentiation formula generated by multivariate analysis for the variables for the differentiation formula, (B) comprising:

(B-1) performing univariate or multivariate logistic regression for the variables for the differentiation formula, with a distinction between patient/healthy individual as a response variable;

(B-2) computing a constant and a coefficient of a differentiation formula from a constant and a partial regression coefficient of a logit model formula generated in the logistic regression of B-1; and

(B-3) generating a differentiation formula based on the constant and the coefficient obtained from processing of B-2;

(C) computing a probability of suffering from ME/CFS by fitting values of the variables of the subject into the differentiation formula; and (D) determining the subject as suffering from ME/CFS if the probability of suffering from ME/CFS is greater than a predetermined value.
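The variable selection of (A-2) can be illustrated with a short Python sketch. It assumes the univariate case, where the fitted logistic probability is a monotonic function of the single variable, so the AUC can be computed directly from the raw usage frequencies as the probability that a randomly chosen patient ranks above a randomly chosen healthy individual. The function names, the threshold of 0.7, and all numeric values are hypothetical and chosen only for illustration.

```python
def roc_auc(patient_values, control_values):
    """AUC via pairwise comparison; ties count as one half."""
    wins = 0.0
    for p in patient_values:
        for c in control_values:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_values) * len(control_values))


def select_genes(usage_by_gene, min_auc=0.7):
    """Keep genes whose univariate AUC reaches the predetermined value."""
    selected = {}
    for gene, (patients, controls) in usage_by_gene.items():
        auc = roc_auc(patients, controls)
        # A gene may be lower in patients; treat both directions alike.
        auc = max(auc, 1.0 - auc)
        if auc >= min_auc:
            selected[gene] = auc
    return selected


# Made-up usage frequencies (%) for (patients, healthy individuals).
example = {
    "IGHV3-30": ([12.0, 11.5, 13.0], [9.0, 10.0, 12.5]),
    "IGHJ6": ([5.0, 6.0], [5.0, 6.0]),
}
print(select_genes(example))
```

For multivariate models the same AUC function would be applied to the fitted probabilities instead of to a single raw variable.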

(Item A7)

The method of any of the preceding items, comprising:

(A) providing a plurality of variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR with regard to myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS); (B) providing a differentiation formula generated by multivariate analysis for the variables; (C) computing a probability of suffering from ME/CFS by fitting values of the variables of the subject into the differentiation formula; (AA) providing a plurality of variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR for a disease other than myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS); (BB) providing a differentiation formula generated by multivariate analysis for the variables; (CC) computing a probability of suffering from a disease other than ME/CFS by fitting values of the variables of the subject into the differentiation formula; and (D) determining the subject as suffering from ME/CFS if the probability of suffering from ME/CFS is greater than a predetermined value and the probability of suffering from a disease other than ME/CFS is lower than a predetermined value.
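The decision of step (D) in this item combines two probabilities, and it can be sketched in Python as follows. The sketch takes the probabilities from (C) and (CC) as given inputs and allows more than one other disease; the function name, the dictionary layout, and the cutoffs of 0.5 are hypothetical choices made for illustration.

```python
def differentiate(p_mecfs, p_other_by_disease, mecfs_cutoff=0.5, other_cutoff=0.5):
    """Return True when ME/CFS is indicated and every other disease is not.

    p_mecfs: probability from the ME/CFS differentiation formula, step (C).
    p_other_by_disease: disease name -> probability from its own formula, step (CC).
    """
    mecfs_positive = p_mecfs > mecfs_cutoff
    others_negative = all(p < other_cutoff for p in p_other_by_disease.values())
    return mecfs_positive and others_negative


# Illustrative probabilities only.
print(differentiate(0.85, {"MS": 0.10}))
print(differentiate(0.85, {"MS": 0.70}))
```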

(Item A8)

An in vitro method of diagnosing a subject as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), comprising diagnosing the subject as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) based on a B cell receptor (BCR) repertoire in a subject.

(Item A8-1)

The method of the preceding item, comprising a feature according to one or more of the preceding items.

(Item B1)

A program comprising an instruction which, when executed by one or more processors, causes the processors to obtain one or more variables comprising a variable associated with a BCR repertoire of a subject and to determine whether the subject has ME/CFS based on the one or more variables.

(Item B1-1)

The program of the preceding item, comprising a feature according to one or more of the preceding items.

(Item B2)

A storage medium for recording a program comprising an instruction which, when executed by one or more processors, causes the processors to obtain one or more variables comprising a variable associated with a BCR repertoire of a subject and to determine whether the subject has ME/CFS based on the one or more variables.

(Item B2-1)

The storage medium of the preceding item, comprising a feature according to one or more of the preceding items.

(Item B2-2)

The storage medium of any of the preceding items, which is a non-transitory storage medium.

(Item C1)

A system comprising a recording unit configured to record information on one or more variables comprising a variable associated with a BCR repertoire of a subject and a determination unit configured to obtain the information and determine whether the subject has ME/CFS.

(Item C1-1)

The system of the preceding item, comprising a feature according to one or more of the preceding items.

Advantageous Effects of Invention

Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) can be accurately diagnosed with the present disclosure. The present disclosure can also be utilized in prediction of development of ME/CFS and prognostic diagnosis thereof. The present disclosure can also be utilized as a marker in the development of a therapeutic agent for ME/CFS.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a diagram showing a result of comparing usage frequencies of each IGHV gene (IGHV family) of a BCR obtained from samples of ME/CFS patients and healthy controls. The vertical axis indicates the usage frequency (%) of each gene, and the bar indicates the standard error. HC: healthy control, ME/CFS: ME/CFS patient.

FIG. 2 is a diagram showing a result of comparing usage frequencies of each of the IGHD and IGHJ families of a BCR obtained from samples of ME/CFS patients and healthy controls. The vertical axis indicates the usage frequency (%) of each gene, and the bar indicates the standard error. HC: healthy control, ME/CFS: ME/CFS patient.

FIG. 3A is a dot plot of the usage frequency of the described IGHV gene of samples of ME/CFS patients and healthy controls. HC: healthy control, ME/CFS: ME/CFS patient.

FIG. 3B is a dot plot of the usage frequency of the described IGHV gene of samples of ME/CFS patients and healthy controls. HC: healthy control, ME/CFS: ME/CFS patient.

FIG. 4 is a dot plot of the described B cell subpopulation count of samples of ME/CFS patients and healthy controls. HC: healthy control, ME/CFS: ME/CFS patient.

FIG. 5 is a dot plot of the regulatory T cell population count of samples of ME/CFS patients and healthy controls. HC: healthy control, ME/CFS: ME/CFS patient.

FIG. 6 is a dot plot of the described T cell population count of samples of ME/CFS patients and healthy controls. HC: healthy control, ME/CFS: ME/CFS patient.

FIG. 7 is a diagram showing an ROC curve in regression analysis when the usage frequencies of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6 and B cell count (5) are used as variables.

FIG. 8 is a diagram showing an ROC curve in regression analysis when the usage frequencies of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6 and Treg count (5) are used as variables.

FIG. 9 is a diagram showing the difference in usage frequencies of IGH genes between ME/CFS patients (n=37) and MS patients (n=10). ME/CFS: ME/CFS patient, MS: MS patient.

FIG. 10 is a diagram showing the difference in diversity indices between ME/CFS patients (n=37) and MS patients (n=10). ME/CFS: ME/CFS patient, MS: MS patient. A significant difference was not detected between ME/CFS patients (n=37) and MS patients (n=10) for any of the diversity indices.

FIG. 11 is a diagram showing the difference in usage frequencies of IGH genes between ME/CFS patients (n=37) and non-ME/CFS patients (n=33). ME/CFS: ME/CFS patient, non-ME/CFS: healthy controls and MS patients.

DESCRIPTION OF EMBODIMENTS

The present disclosure is explained hereinafter while showing the best mode of the disclosure. Throughout the entire specification, a singular expression should be understood as encompassing the concept thereof in the plural form, unless specifically noted otherwise. Thus, singular articles (e.g., “a”, “an”, “the”, and the like in the case of English) should also be understood as encompassing the concept thereof in the plural form, unless specifically noted otherwise. Further, the terms used herein should be understood as being used in the meaning that is commonly used in the art, unless specifically noted otherwise. Thus, unless defined otherwise, all terminologies and scientific technical terms that are used herein have the same meaning as the general understanding of those skilled in the art to which the present disclosure pertains. In case of a contradiction, the present specification (including the definitions) takes precedence.

The definitions of the terms and/or the detailed basic technology that are particularly used herein are described hereinafter as appropriate.

(ME/CFS)

Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a severe chronic disease whose core symptoms are an unexplained high level of fatigue, inordinate exhaustion after exertion, sleep disorder, or cognitive dysfunction lasting 6 months or longer, and which often involves orthostatic intolerance, pain, digestive symptoms, hypersensitivity to light, sound, odor, or chemical substances, or the like. While the number of patients is estimated to be 100,000 or more domestically, the pathology is still not elucidated and there is no disease-specific testing method or effective therapeutic method, so that suitable medical treatment is currently unavailable.

In view of the recent report from Norway of the effectiveness of B cell depletion therapy using rituximab, the immunological pathology of ME/CFS has drawn global attention. Within the acquired immune system, B cells are responsible for antibody production, and B cell receptors are made very diverse by gene rearrangement in order to respond to diverse antigens. A specific B cell receptor (clone) is elevated in autoimmune diseases such as systemic lupus erythematosus or in hematological tumors of the B cell lineage, and is detected as a bias in the entire collection of diverse B cell receptors of an individual (repertoire). Use of a next generation sequencer can enable repertoire analysis with a method eliminating biases and is potentially useful in the detection of abnormalities in the B cell system in ME/CFS.

ME/CFS is diagnosed from symptoms and a medical interview in accordance with criteria such as the Fukuda criteria, the Canadian criteria, or the International consensus criteria. A biomarker indicating the disease has not been developed yet for ME/CFS. There are a plurality of diagnostic criteria for diagnosis using a combination of symptoms. The Fukuda criteria are the oldest (1994), followed by the Canadian criteria (2003) and subsequently the International criteria (2011). In the meantime, however, the common understanding of experts on the fundamental symptom of the disease (core symptom) has advanced, such that creation of new diagnostic criteria is currently being considered. The reasons why multiple diagnostic criteria including old criteria co-exist are understood to be the need for both 1) detailed/strict criteria (research criteria) for promoting research and advancing understanding of the pathology of the disease/development of a therapeutic method, and 2) simple criteria (diagnostic criteria) for a physician to render a diagnosis and promote medical/social intervention for a patient. Specifically, the Fukuda criteria are still considered essential today for research as the research criteria, while newer criteria are meant for diagnosis. Such a double standard is generally unacceptable for other diseases, but is recognized/accepted among experts as an exception for ME/CFS. The diagnostic criteria prepared by the Japanese Ministry of Health, Labour and Welfare are based on international criteria that are adjusted for domestic practice (PS setting and the like). As used herein, ME/CFS can refer to a disease specified by the research criteria, i.e., the Fukuda criteria, but can also refer to diseases defined by other diagnostic criteria. ME/CFS can be confirmed herein by any of the diagnostic criteria accepted in the art.

In the present disclosure, a subject shown to have ME/CFS, a subject shown to possibly have ME/CFS, a subject shown to have the possibility of developing ME/CFS, or a subject shown to be suffering from ME/CFS with poor prognosis can be suitably treated. ME/CFS treatment can use an immunomodulator (rituximab or the like), nonsteroidal anti-inflammatory agent, antidepressant, anxiolytic, or the like.

(BCR Repertoire)

In one embodiment, the present disclosure provides a method of using a B cell receptor (BCR) repertoire as an indicator of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in a subject. As used herein, “B cell receptor (BCR)” is also called a B cell antigen receptor, referring to a complex comprised of an Igα/Igβ (CD79a/CD79b) heterodimer (α/β) associated with a membrane bound immunoglobulin (mIg) molecule. An mIg subunit binds to an antigen to induce aggregation of receptors, while an α/β subunit transmits a signal toward the cell. Aggregation of BCRs is understood to quickly activate the Src family kinases Lyn, Blk, and Fyn, as well as the tyrosine kinases Syk and Btk. BCR signaling is complex and can produce many different outcomes. Examples thereof include survival, tolerance (anergy; lack of responsiveness to an antigen) or apoptosis, cell division, differentiation into an antibody producing cell or memory B cell, and the like. Hundreds of millions of types of T cells with different TCR variable region sequences are produced, and hundreds of millions of types of B cells with different BCR (or antibody) variable region sequences are produced. Since the individual sequences of TCRs and BCRs vary due to rearrangement or mutation of the genomic sequence, a clue for antigen specificity of a T cell or B cell can be found by determining the TCR/BCR genomic sequence or the mRNA (cDNA) sequence.

As used herein, “V region” refers to a variable (V) region of a variable region of a TCR chain or BCR chain.

As used herein, “D region” refers to a D region of a variable region of a TCR chain or BCR chain.

As used herein, “J region” refers to a J region of a variable region of a TCR chain or BCR chain.

As used herein, “C region” refers to a constant (C) region of a TCR chain or BCR chain.

As used herein, “repertoire of a variable region” refers to a collection of V(D)J regions optionally created by gene rearrangement in TCR or BCR. The phrases TCR repertoire, BCR repertoire and the like are used, but they can also be called, for example, a T cell repertoire, B cell repertoire, or the like. A repertoire can be considered as having two aspects, i.e., information on the degree of diversity as a whole and information on how frequently each gene is used. One embodiment provides a method of using one or more variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR in a subject as an indicator of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in the subject. The usage frequency of one or more genes in an IgG H chain variable region of a BCR can be derived as follows.
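The two aspects of a repertoire named above, usage frequency and overall diversity, can be illustrated with a small Python sketch. It assumes that each sequencing read has already been assigned a gene name and that clone sizes are available as read counts. The function names and input values are hypothetical, and the Shannon index is used here only as one common example of a diversity index, not as the index used in the present disclosure.

```python
import math
from collections import Counter


def usage_frequency(gene_calls):
    """Percentage of reads assigned to each gene (e.g. each IGHV gene)."""
    if not gene_calls:
        raise ValueError("no reads supplied")
    counts = Counter(gene_calls)
    total = sum(counts.values())
    return {gene: 100.0 * n / total for gene, n in counts.items()}


def shannon_index(clone_counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over clones with reads."""
    total = sum(clone_counts)
    if total <= 0:
        raise ValueError("no reads supplied")
    return -sum((n / total) * math.log(n / total) for n in clone_counts if n > 0)


# Made-up reads: three assigned to IGHV3-30 and one to IGHV1-3.
reads = ["IGHV3-30", "IGHV3-30", "IGHV1-3", "IGHV3-30"]
print(usage_frequency(reads))
print(shannon_index([1, 1, 1, 1]))
```

An evenly distributed repertoire gives a higher index than one dominated by a single expanded clone, which is the kind of bias the repertoire analysis is meant to detect.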

An example of a method of determining a BCR repertoire is a method of analyzing the ratio of T cells expressing individual Vβ chains by flow cytometry using a Vβ chain specific antibody to determine how much of each individual V chain is used by a B cell in a sample (FACS analysis). TCR repertoire analysis through a molecular biological approach has been conceived based on information on a TCR gene obtained from a human genome sequence. This includes a method of extracting RNA from a cell sample and synthesizing a complementary DNA, and then subjecting a TCR gene to PCR amplification for quantification.

A nucleic acid can be extracted from a cell sample by using a tool that is known in the art such as RNeasy Plus Universal Mini Kit (QIAGEN). Total RNA can be extracted and purified from a cell dissolved in a TRIzol LS reagent by using an RNeasy Plus Universal Mini Kit (QIAGEN). A complementary DNA can be synthesized from an extracted RNA by using any reverse transcriptase known in the art such as Superscript III™ (Invitrogen).

Those skilled in the art can appropriately perform PCR amplification of a BCR gene using any polymerase known in the art. However, an “unbiased” amplification of a gene with large variation such as a BCR gene can result in an advantageous effect for accurate measurement.

A method of designing numerous individual BCR V chain specific primers as primers used for PCR amplification and quantifying each separately by real-time PCR or the like, or a method of concurrently amplifying with such specific primers (multiplex PCR), has been used. However, even when each V chain is quantified using an endogenous control, an accurate analysis cannot be performed if many primers are used. Furthermore, multiplex PCR has a disadvantage in that a difference in amplification efficiencies among primers leads to a bias during PCR amplification.

A preferred embodiment of the present disclosure determines BCR diversity by amplifying, without changing the frequency of presence, BCR genes including all isotype and subtype genes with one set of primers consisting of one type of forward primer and one type of reverse primer as described in WO 2015/075939 (Repertoire Genesis Inc.). The following primer design is advantageous for unbiased amplification.

Focus was placed on the genetic structure of a BCR gene. An adaptor sequence is added to the 5′ terminus, without setting a primer in the V region with a high degree of diversity, so that all V region containing genes are amplified. Such an adaptor can have any length or base sequence. About 20 base pairs are optimal, but a sequence from 10 bases to 100 bases can be used. An adaptor added to the 3′ terminus is removed with a restriction enzyme, and all BCR genes are amplified using an adaptor primer having the same sequence as the 20 base pair adaptor together with a reverse primer specific to a C region, which is a common sequence.

A complementary strand DNA is synthesized with a reverse transcriptase from a BCR gene messenger RNA and then a double stranded complementary DNA is synthesized. A double stranded complementary DNA comprising V regions with different lengths is synthesized by a reverse transcription reaction or a double strand synthesizing reaction. Adaptors consisting of 20 base pairs and 10 base pairs are added to the 5′ terminal section of such genes by a DNA ligase reaction.

The genes can be amplified by setting a reverse primer to a C region of a heavy chain, i.e., μ chain, α chain, δ chain, γ chain, or ε chain, or a light chain, i.e., κ chain or λ chain of BCRs. As reverse primers set in a C region, primers are set which match the sequences of each of Cμ, Cα, Cδ, Cγ, Cε, Cκ, and Cλ and have a mismatch to an extent that other C region sequences are not primed for BCRs. A reverse primer of a C region is optimally produced while considering the base sequence, base composition, DNA melting temperature (Tm), or presence/absence of a self-complementary sequence, so that amplification with an adaptor primer is possible. A primer can be set in a region other than the base sequence that is different among allelic sequences in a C region sequence to uniformly amplify all alleles. Nested PCR with a plurality of stages is performed in order to enhance the specificity of an amplification reaction.

While the length (number of bases) of a primer candidate sequence is not particularly limited for a sequence not comprising a sequence that is different among allelic sequences for each primer, the number of bases is 10 to 100, preferably 15 to 50, and more preferably 20 to 30. Use of such unbiased amplification is advantageous and preferred for identification of a low frequency (1/10,000 to 1/100,000 or less) gene.
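
The screening criteria mentioned above (primer length, base composition, Tm, and self-complementarity) can be illustrated with a minimal sketch. The function names are hypothetical, the Wallace rule is a rough approximation swapped in for illustration and is intended only for short oligonucleotides, and the disclosure does not specify how these properties are to be computed.

```python
# Illustrative primer screening helpers (all names are hypothetical).
# Input sequences are assumed to contain only A, C, G, and T.

def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Rough Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def reverse_complement(seq):
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def longest_self_complementary_run(seq):
    """Length of the longest substring whose reverse complement also occurs in seq."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    best = 0
    for size in range(1, len(seq) + 1):
        if any(seq[i:i + size] in rc for i in range(len(seq) - size + 1)):
            best = size
        else:
            break  # no run of this size, so no longer run can exist
    return best

def length_tier(seq):
    """Classify primer length using the ranges stated in the text."""
    n = len(seq)
    if 20 <= n <= 30:
        return "more preferred"
    if 15 <= n <= 50:
        return "preferred"
    if 10 <= n <= 100:
        return "allowed"
    return "out of range"
```

A candidate could then be kept or discarded by comparing these values with thresholds chosen by the practitioner; no specific thresholds are implied here.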

A BCR repertoire can be determined from read data that is obtained by sequencing a BCR gene amplified in the above manner.

The sequencing approach is not limited, as long as a sequence of a nucleic acid sample can be determined. While any approach known in the art can be utilized, it is preferable to use next generation sequencing (NGS). Examples of next generation sequencing include, but are not limited to, pyrosequencing, sequencing by synthesis, sequencing by ligation, ion semiconductor sequencing, and the like.

The obtained read data can be mapped to a reference sequence comprising V, D, and J genes to derive the number of unique reads and determine the BCR repertoire.

One embodiment prepares a reference database to be used for each of V, D, J, and C gene regions. Typically, a nucleic acid sequence data set for each allele or each region published by the IMGT is used, but the data set is not limited thereto. Any data set with a unique ID assigned to each sequence can be used.

The obtained read data (including those subjected to appropriate processing such as trimming as needed) is used as the input sequence set to search for homology with a reference database for each gene region, and an alignment with the closest reference allele and the sequence thereof are recorded. In this regard, an algorithm with high tolerance for a mismatch except for C is used for homology search. When a common homology search program BLAST is used, shortening of the window size, reduction in mismatch penalty, and reduction in gap penalty are set for each region. The closest reference allele is selected by using a homology score, alignment length, kernel length (length of consecutively matching base sequence), and number of matching bases as indicators, which are applied in accordance with a defined order of priority. For an input sequence with determined V and J used in the present disclosure, a CDR3 sequence is extracted with the front of CDR3 on reference V and end of CDR3 on reference J as guides. This is translated into an amino acid sequence for use in classification of a D region. When a reference database of a D region is prepared, a combination of results of homology search and results of amino acid sequence translation is used as a classification result.
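
As a minimal sketch of the selection step, the closest reference allele can be chosen by comparing candidate hits on the four indicators in a fixed order. The `Hit` structure, the gapped-alignment string representation, and the assumption that the priority order is the order in which the indicators are listed (homology score, alignment length, kernel length, matching bases) are all illustrative and are not taken from the disclosure.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    """One homology-search hit (hypothetical structure)."""
    allele: str      # reference allele ID
    score: float     # homology score reported by the search program
    query_aln: str   # aligned query sequence, '-' for gaps
    ref_aln: str     # aligned reference sequence, '-' for gaps

def match_stats(hit):
    """Return (alignment length, kernel length, number of matching bases)."""
    matches = 0
    run = 0
    kernel = 0
    for q, r in zip(hit.query_aln, hit.ref_aln):
        if q == r and q != "-":
            matches += 1
            run += 1
            kernel = max(kernel, run)  # longest run of consecutive matches
        else:
            run = 0
    return len(hit.query_aln), kernel, matches

def closest_allele(hits):
    """Pick the hit that is best on the indicators, compared in priority order."""
    def key(hit):
        length, kernel, matches = match_stats(hit)
        # Assumed priority: score, then length, then kernel, then matches.
        return (hit.score, length, kernel, matches)
    return max(hits, key=key).allele

hits = [
    Hit("IGHV3-30*01", 50.0, "ACGTACGT", "ACGTACGT"),
    Hit("IGHV3-33*01", 50.0, "ACGTACGT", "ACGAACGT"),
]
best = closest_allele(hits)
```

Tuple comparison makes the priority order explicit: a later indicator is consulted only when all earlier ones tie, as with the two example hits, which tie on score and alignment length and are separated by kernel length.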

In view of the above, each allele of V, D, J, and C is assigned for each sequence in an input set. The frequency of appearance by each of V, D, J, and C or frequency of appearance of a combination thereof is subsequently calculated in the entire input set to derive a BCR repertoire. The frequency of appearance is calculated in a unit of allele or unit of gene name depending on the precision required in classification. The latter is made possible by translating each allele into a gene name.
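
The frequency calculation can be sketched as follows. The allele naming with a `*NN` suffix follows the IMGT convention, and translating an allele into a gene name by stripping that suffix is an assumption made for illustration; the function names are hypothetical.

```python
from collections import Counter

def allele_to_gene(allele):
    """Translate an allele ID such as 'IGHV3-30*01' into its gene name 'IGHV3-30'."""
    return allele.split("*", 1)[0]

def usage_frequency(assignments, by="gene"):
    """Frequency of appearance of each gene (or allele) across the input set.

    assignments: one assigned allele ID per input sequence.
    by: 'gene' to tally by gene name, 'allele' to tally by allele.
    """
    if by == "allele":
        names = list(assignments)
    else:
        names = [allele_to_gene(a) for a in assignments]
    counts = Counter(names)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

v_calls = ["IGHV3-30*01", "IGHV3-30*02", "IGHV1-3*01", "IGHV3-30*01"]
by_gene = usage_frequency(v_calls, by="gene")
by_allele = usage_frequency(v_calls, by="allele")
```

The same function can be applied separately to the V, D, J, and C assignments, or to tuples of assignments when the frequency of a combination is of interest.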

After a V region, D region, J region, and C region are assigned to read data, matching reads can be tallied to calculate the number of reads detected in a sample and the ratio to the total number of reads (frequency) for each unique read (read without the same sequence).
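
A minimal sketch of this tally is shown below. Treating a read as unique by the tuple of its V, D, J, and C assignments plus its CDR3 amino acid sequence is an assumption for illustration; the example reads are invented placeholders, not data from the Examples.

```python
from collections import Counter

def tally_unique_reads(reads):
    """Count identical reads and compute each unique read's share of the total.

    reads: iterable of hashable read keys, here (V, D, J, C, CDR3) tuples.
    Returns a list of (read, count, frequency), most abundant first.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    return [(read, n, n / total) for read, n in counts.most_common()]

# Placeholder reads for illustration only.
read_a = ("IGHV3-30", "IGHD1-26", "IGHJ6", "IGHG1", "CARDYW")
read_b = ("IGHV1-3", "IGHD3-22", "IGHJ4", "IGHG2", "CARGGW")
table = tally_unique_reads([read_a, read_b, read_a, read_a, read_b])
```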

In addition, a diversity index or similarity index can be computed with statistical analysis software such as EstimateS or R (vegan) by using data such as the number of samples, read type, or the number of reads. In a preferred embodiment, TCR repertoire analysis software (Repertoire Genesis Inc.) is used.

As used herein, “BCR diversity” refers to diversity of the repertoire of B cell receptors of a subject. Those skilled in the art can determine the BCR diversity using various means known in the art. Any BCR diversity index that is known in the art can be used. Examples of BCR diversity indices include Shannon-Weaver index, Simpson index, inverse Simpson index, Pielou's evenness index, normalized Shannon-Weaver index, DE index (e.g., DE50 index, DE30 index, or DE80 index), or Unique index (e.g., Unique50 index, Unique30 index, or Unique80 index) applied to BCRs.
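
Several of the listed indices can be computed directly from the read counts of unique reads, as in the following sketch. It uses common textbook definitions (natural logarithm for the Shannon-Weaver index, 1 minus the sum of squared proportions for the Simpson index); these conventions are assumptions, since several variants exist and the software named above may use different ones. The DE and Unique indices are omitted because their definitions are not given in this passage.

```python
import math

def diversity_indices(read_counts):
    """Compute common diversity indices from per-unique-read counts."""
    counts = [c for c in read_counts if c > 0]
    if not counts:
        raise ValueError("at least one positive read count is required")
    total = sum(counts)
    p = [c / total for c in counts]            # proportion of each unique read
    shannon = -sum(pi * math.log(pi) for pi in p)
    sum_sq = sum(pi * pi for pi in p)
    richness = len(counts)                     # number of unique reads
    # Pielou's evenness: Shannon index divided by its maximum, ln(richness).
    pielou = shannon / math.log(richness) if richness > 1 else 0.0
    return {
        "shannon": shannon,
        "simpson": 1.0 - sum_sq,
        "inverse_simpson": 1.0 / sum_sq,
        "pielou": pielou,
    }

even = diversity_indices([1, 1, 1, 1])       # perfectly even repertoire
skewed = diversity_indices([97, 1, 1, 1])    # dominated by one clone
```

A repertoire dominated by a few expanded clones gives lower values on all four indices than an even repertoire with the same number of unique reads.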

(Large-Scale High Efficiency BCR Repertoire Analysis)

A preferred embodiment of the present disclosure determines the BCR diversity using large-scale high efficiency BCR repertoire analysis. “Large-scale high efficiency repertoire analysis” herein is described in WO 2015/075939 (the entire disclosure thereof is incorporated herein by reference as needed) and is referred to as “large-scale high efficiency BCR repertoire analysis” when targeting BCRs. Large-scale high efficiency repertoire analysis is a method of quantitatively analyzing a repertoire (variable region of a T cell receptor (TCR) or B cell receptor (BCR)) of a subject by using a database. This method can be materialized by a method of quantitatively analyzing a repertoire of a variable region of a T cell receptor (TCR) or B cell receptor (BCR) by using a database, the method comprising (1) providing a nucleic acid sample comprising a nucleic acid sequence of the T cell receptor (TCR) or the B cell receptor (BCR) which is amplified from the subject in an unbiased manner; (2) determining the nucleic acid sequence comprised in the nucleic acid sample; and (3) computing a frequency of appearance of each gene or a combination thereof based on the determined nucleic acid sequence to derive a TCR or BCR repertoire of the subject, wherein the nucleic acid sample comprises nucleic acid sequences of a plurality of types of T cell receptor (TCR) or B cell receptor (BCR), and wherein (2) determines the nucleic acid sequence by single sequencing using a common adaptor primer. 
This method preferably comprises (1) providing a nucleic acid sample comprising a nucleic acid sequence of a T cell receptor (TCR) or a B cell receptor (BCR) which is amplified from the subject in an unbiased manner; (2) determining the nucleic acid sequence comprised in the nucleic acid sample; and (3) computing a frequency of appearance of each gene or a combination thereof based on the determined nucleic acid sequence to derive a repertoire of the subject, (1) comprising: (1-1) synthesizing a complementary DNA by using an RNA sample derived from a target cell as a template; (1-2) synthesizing a double stranded complementary DNA by using the complementary DNA as a template; (1-3) synthesizing an adaptor-added double stranded complementary DNA by adding a common adaptor primer sequence to the double stranded complementary DNA; (1-4) performing a first PCR amplification reaction by using the adaptor-added double stranded complementary DNA, a common adaptor primer consisting of the common adaptor primer sequence, and a first TCR or BCR C region specific primer, wherein the first TCR or BCR C region specific primer is designed to comprise a sequence that is sufficiently specific to a C region of interest of the TCR or BCR and not homologous to other genetic sequences, and comprise a mismatching base between subtypes downstream when amplified; (1-5) performing a second PCR amplification reaction by using a PCR amplicon of (1-4), the common adaptor primer, and a second TCR or BCR C region specific primer, wherein the second TCR or BCR C region specific primer is designed to have a sequence that is a complete match with the TCR or BCR C region in a sequence downstream of the sequence of the first TCR or BCR C region specific primer, but comprise a sequence that is not homologous to other genetic sequences, and comprise a mismatching base between subtypes downstream when amplified; and (1-6) performing a third PCR amplification reaction by using a PCR amplicon of (1-5), an added common adaptor primer in which a nucleic acid sequence of the common adaptor primer comprises a first additional adaptor nucleic acid sequence, and an adaptor-added third TCR or BCR C region specific primer in which a second additional adaptor nucleic acid sequence and a molecule identification (MID Tag) sequence are added to a third TCR or BCR C region specific sequence; wherein the third TCR or BCR C region specific primer is designed to have a sequence that is a complete match with the TCR or BCR C region in a sequence downstream of the sequence of the second TCR or BCR C region specific primer, but comprise a sequence that is not homologous to other genetic sequences, and comprise a mismatching base between subtypes downstream when amplified, the first additional adaptor nucleic acid sequence is a sequence that is suitable for binding to a DNA capturing bead and for an emPCR reaction, the second additional adaptor nucleic acid sequence is a sequence that is suitable for an emPCR reaction, and the molecule identification (MID Tag) sequence is a sequence for imparting uniqueness such that an amplicon can be identified. The specific detail of this method is described in WO 2015/075939. Those skilled in the art can perform analysis by appropriately referring to this document, the Examples of the present specification, and the like.

(Diagnosis)

One embodiment of the present disclosure provides a method of using a B cell receptor (BCR) repertoire in a subject as an indicator of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in the subject. This method can comprise providing prognostic diagnosis or prediction on the development of ME/CFS, or an indicator for the development of a therapeutic agent for ME/CFS, in addition to diagnosis of a subject developing ME/CFS. This can be an in vitro or in silico method.

The method can comprise using one or more variables comprising a variable associated with a BCR repertoire. If one or more variables are used as an indicator of ME/CFS, the method can be performed, for example, by computing, for a subject, a value of a suitable formula comprising the one or more variables and comparing the value with a suitable criterion. The formula can be found by logistic regression or the like. Examples of variables used include, but are not limited to, one or more variables selected from a usage frequency of one or more genes in an IgG H chain variable region of a BCR in the subject, a BCR diversity index in the subject, and one or more immune cell subpopulation counts in the subject, and a combination of two or more variables.

An embodiment of the present disclosure can preferably use one or more variables comprising a usage frequency of one or more IGH genes of a BCR in a subject as an indicator of ME/CFS. This can enable use of the one or more variables as an indicator of a subject suffering from ME/CFS and not another disease (i.e., differential diagnosis). Although not wishing to be bound by any theory, the immune cell subpopulation count and BCR repertoire diversity index can also vary due to another disease that affects the entire immunological state, so that the variables can suggest the presence of ME/CFS in a subject, but may not be able to rule out other diseases. Meanwhile, the usage frequency of an IGH gene is understood to vary by reflecting the mechanism of some type of disease, so that the usage frequency can be an indicator of a subject suffering from ME/CFS and not another disease. Therefore, the usage frequency is understood to be very useful for diagnosis in actual clinical settings.

Examples of “another disease” with respect to ME/CFS include any disease that can affect the immunological status in addition to any disease that exhibits a symptom similar to “unexplained high level of fatigue, inordinate exhaustion after exertion, sleep disorder, or cognitive dysfunction that lasts 6 months or longer as the core symptom and often involving orthostatic intolerance, pain, digestive symptom, hypersensitivity to light, sound, odor, or chemical substance, or the like”, which is a symptom of ME/CFS.

Examples thereof include psychiatric disorders (e.g., depression, adjustment disorder, and somatoform disorder), primary sleep disorders (e.g., sleep apnea and narcolepsy), endocrine diseases (e.g., hypopituitarism and thyroid disease), infectious diseases (e.g., AIDS, hepatitis B, hepatitis C, and other chronic infectious diseases), autoimmune diseases (e.g., rheumatoid arthritis, systemic lupus erythematosus, Sjogren's syndrome, and the like), inflammatory diseases (e.g., inflammatory bowel disease, chronic pancreatitis, and other chronic inflammatory diseases), and nervous system diseases (e.g., multiple sclerosis (MS) and autoimmune encephalitis). An embodiment of the present disclosure provides a method of distinguishing ME/CFS from another disease in a subject. Multiple sclerosis (MS) is an example of another disease. The description regarding a method of using a B cell receptor (BCR) repertoire in a subject as an indicator of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in the subject can be applied as needed to carry out the method of distinguishing ME/CFS from another disease herein.

ME/CFS is diagnosed with a “combination of clinical symptoms”, in which a process of ruling out other diseases and causes is very critical. The “criteria for a combination of clinical symptoms” can also be satisfied in some cases by lifestyle habits (inconsistent schedule, severe exertion, etc.). The criteria for a clinical symptom can also be satisfied by other causes such as malignant tumor, metabolic disease, cranial nerve disease, or the like. For this reason, a B cell repertoire, which is understood to reflect the cause of a disease, and usage frequency of each gene therein are considered very useful in clinical diagnosis as shown in the Examples herein. For example, more accurate differential diagnosis can be provided by adding the differentiation method using a BCR repertoire (IGH gene usage frequency) described herein. Alternatively, clinical information (clinical observation, MRI image observation, cerebrospinal fluid observation, or the like) can be used in combination with the method of the disclosure as needed. One embodiment can ultimately identify one disease by determining the possibility of a patient, who is exhibiting a systemic symptom including ME/CFS, suffering from each disease with a differentiation model or differentiation formula using a usage frequency of each IGH gene.

As used herein, “sensitivity” refers to the probability of correctly determining as positive what should be determined as positive, with high sensitivity related to reduced false-negatives. High sensitivity is useful in rule-out diagnosis.

As used herein, “specificity” refers to the probability of correctly determining as negative what is negative, with high specificity related to reduced false-positives. High specificity is useful in rule-in diagnosis.
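
For illustration, both quantities can be computed from paired true and predicted labels as ratios of counts: sensitivity as true positives over all actual positives, and specificity as true negatives over all actual negatives. The function name and the example labels are hypothetical.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) from paired boolean labels.

    truth: True where the subject actually has the condition.
    predicted: True where the test determined the subject as positive.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    return tp / (tp + fn), tn / (tn + fp)

# Placeholder labels: four actual positives followed by four actual negatives.
truth = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, False, False, True, True]
sens, spec = sensitivity_specificity(truth, predicted)
```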

As used herein, “ROC curve” refers to a curve plotting (sensitivity) and (1-specificity) when a cutoff value based on a regression formula of a marker is varied as a parameter. Those skilled in the art can appropriately find the AUC (area under the curve) under a ROC curve. It is understood that the AUC of a ROC curve indicates the performance of a prediction model. The Examples herein demonstrate a large number of variables or combinations of variables exhibiting high predictive performance of AUC ≥ 0.7, AUC ≥ 0.8, or AUC ≥ 0.9 under a ROC curve in regression analysis. It is understood that a diagnostic model with an AUC of about 0.7 can be used.
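
A minimal sketch of this construction follows: the cutoff is swept over the observed scores, each cutoff yields one (1-specificity, sensitivity) point, and the area is accumulated with the trapezoidal rule. The scores and labels are invented placeholders, and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return ROC points (1 - specificity, sensitivity) as the cutoff is lowered.

    scores: model output per subject (higher means more likely positive).
    labels: 1 for patient, 0 for control.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= cutoff and l)
        fp = sum(1 for s, l in zip(scores, labels) if s >= cutoff and not l)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Placeholder data for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
area = auc(roc_points(scores, labels))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.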

Diagnosis of ME/CFS, which has been challenging up to this point, is enabled by use of one or more variables (can be a combination of variables as needed) used as an indicator of ME/CFS in a subject, exhibiting AUC ≥ 0.7 under a ROC curve in regression analysis.

The method can comprise obtaining one or more variables used as an indicator of ME/CFS in a subject. In one embodiment, the step of obtaining a variable can comprise analyzing a sample of a subject such as determining a repertoire of a BCR in a subject and/or measuring the cell subpopulation count in a subject. The step of determining a repertoire of a BCR can comprise determining a BCR diversity in a subject and/or determining a usage frequency of one or more genes in an IgG H chain variable region of a BCR of a subject. The cell subpopulation count in a subject can be measured by any methodology that is known to those skilled in the art including flow cytometry. It is also possible to obtain data for a value determined for a subject previously.

The method can comprise determining whether a subject has ME/CFS based on one or more variables described herein. The determining step can be performed by comparing the values of dependent variables of a function including the one or more variables with a suitable threshold value.

In one embodiment, the method described herein is performed in silico. The method comprises obtaining one or more variables comprising a variable associated with a BCR repertoire of a subject and determining whether the subject has ME/CFS based on the one or more variables. Programs, storage media for recording a program, and systems implementing any method described herein are also within the scope of the disclosure.

One embodiment provides a program comprising an instruction which, when executed by one or more processors, causes the processors to obtain one or more variables comprising a variable associated with a BCR repertoire of a subject and to determine whether the subject has ME/CFS based on the one or more variables, or a storage medium for recording the program. A system comprising a recording unit configured to record information on one or more variables comprising a variable associated with a BCR repertoire of a subject and a determination unit configured to obtain said information and determine whether the subject has ME/CFS is also provided. The system can be a computer, and can comprise a program described herein or a storage medium for recording the program as needed.

(Indicator)

(IGH gene) One embodiment of the present disclosure provides a method of using one or more variables comprising a usage frequency of one or more genes (IGH gene) in an IgG H chain variable region of a BCR in a subject as an indicator of ME/CFS in the subject. As an IGH gene, at least one gene selected from the group consisting of IGHV1-2, IGHV1-3, IGHV1-8, IGHV1-18, IGHV1-24, IGHV1-38-4, IGHV1-45, IGHV1-46, IGHV1-58, IGHV1-69, IGHV1-69-2, IGHV1-69D, IGHV1/OR15-1, IGHV1/OR15-5, IGHV1/OR15-9, IGHV1/OR21-1, IGHV2-5, IGHV2-26, IGHV2-70, IGHV2-70D, IGHV2/OR16-5, IGHV3-7, IGHV3-9, IGHV3-11, IGHV3-13, IGHV3-15, IGHV3-16, IGHV3-20, IGHV3-21, IGHV3-23, IGHV3-23D, IGHV3-25, IGHV3-30, IGHV3-30-3, IGHV3-30-5, IGHV3-33, IGHV3-35, IGHV3-38, IGHV3-38-3, IGHV3-43, IGHV3-43D, IGHV3-48, IGHV3-49, IGHV3-53, IGHV3-64, IGHV3-64D, IGHV3-66, IGHV3-72, IGHV3-73, IGHV3-74, IGHV3-NL1, IGHV3/OR15-7, IGHV3/OR16-6, IGHV3/OR16-8, IGHV3/OR16-9, IGHV3/OR16-10, IGHV3/OR16-12, IGHV3/OR16-13, IGHV4-4, IGHV4-28, IGHV4-30-2, IGHV4-30-4, IGHV4-31, IGHV4-34, IGHV4-38-2, IGHV4-39, IGHV4-59, IGHV4-61, IGHV4/OR15-8, IGHV5-10-1, IGHV5-51, IGHV6-1, IGHV7-4-1, IGHV7-81, IGHD1-1, IGHD1-7, IGHD1-14, IGHD1-20, IGHD1-26, IGHD1/OR15-1a/b, IGHD2-2, IGHD2-8, IGHD2-15, IGHD2-21, IGHD2/OR15-2a/b, IGHD3-3, IGHD3-9, IGHD3-10, IGHD3-16, IGHD3-22, IGHD3/OR15-3a/b, IGHD4-4, IGHD4-11, IGHD4-17, IGHD4-23, IGHD4/OR15-4a/b, IGHD5-5, IGHD5-12, IGHD5-18, IGHD5-24, IGHD5/OR15-5a/b, IGHD6-6, IGHD6-13, IGHD6-19, IGHD6-25, IGHD7-27, IGHJ1, IGHJ2, IGHJ3, IGHJ4, IGHJ5, IGHJ6, IGHG1, IGHG2, IGHG3, IGHG4, and IGHGP, and any combination thereof can be used.

The Examples herein show that usage frequencies of various IGH genes can be an indicator of ME/CFS. In the present disclosure, the number of IGH genes that are used is not particularly limited. Any number of genes from 1 to 117 genes can be used. One or more variables used as an indicator in the present disclosure can comprise a usage frequency of about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 15, about 20, about 30, about 40, about 50, about 60, about 70, about 80, about 90, about 100, about 110, or a greater number of IGH genes. When suitable, one or more variables can also be one variable.

In a preferred embodiment, one or more IGH genes comprise at least one gene selected from the group consisting of IGHV3-73, IGHV1-69-2, IGHV5-51, IGHV4-31, IGHV3-23D, IGHV1/OR15-9, IGHV4-39, IGHD5-12, IGHV3-43D, IGHD4-17, IGHV5-10-1, IGHD4/OR15-4a/b, IGHG4, IGHV1/OR15-5, IGHV3/OR16-9, IGHD1-7, IGHV3-21, IGHD6-6, IGHV3-33, IGHD4-23, IGHV3-30-5, IGHV3-23, IGHD6-13, IGHV3-64D, IGHV3-48, IGHV3-64, IGHG1, IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, and IGHD3-22. More preferably, one or more IGH genes comprise at least one gene selected from the group consisting of IGHD6-6, IGHV3-33, IGHD4-23, IGHV3-30-5, IGHV3-23, IGHD6-13, IGHV3-64D, IGHV3-48, IGHV3-64, IGHG1, IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, and IGHD3-22. Still more preferably, one or more IGH genes comprise at least one gene selected from the group consisting of IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, and IGHD3-22. In another embodiment, one or more genes comprise at least one gene selected from the group consisting of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6. The Examples herein suggest that the aforementioned genes can be used alone for prediction.

Another embodiment of the present disclosure provides a method using variables comprising usage frequencies of two or more genes in an IgG H chain variable region of a BCR in a subject as one or more variables. It is understood that the precision of the method can be further enhanced by using usage frequencies of two or more genes. Those skilled in the art can select and use a combination of usage frequencies of two or more genes that are suitable in view of the descriptions herein. One or more variables can be selected so that AUC ≥ 0.7, AUC ≥ 0.8, or AUC ≥ 0.9 is exhibited under a ROC curve in regression analysis. An example of a combination of genes includes IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6. Those skilled in the art can appropriately compute AUC under a ROC curve in regression analysis for a combination of genes in accordance with the method described herein, and determine whether the combination of genes exhibits a desired AUC. For example, regression analysis can be for differentiating a normal control from ME/CFS.

The Examples herein show, for a combination of two IGH genes, that any of the 117 genes can materialize a combination exhibiting AUC ≥ 0.7 under a ROC curve in regression analysis when combined with another suitable IGH gene.

(Diversity Index)

The present disclosure can use a BCR diversity index in a subject as a variable, in place of or in addition to another variable in diagnosis of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in a subject.

Any index that is known in the art can be used as a BCR diversity index. Examples thereof include the Shannon-Weaver index, Simpson index, inverse Simpson index, Pielou's evenness index, normalized Shannon-Weaver index, DE index (e.g., DE50 index, DE30 index, and DE80 index), and the like.

In Example 3-2 and Table 7 herein, some diversity indices are extracted with significance of 0.1≤P<0.2 in simple regression analysis. Example 6 herein has found that some of the multiple combinations of variables comprising a diversity index exhibit an AUC value of 0.8 or greater in ROC analysis (Table 18).

(Cell Subpopulation)

The present disclosure can use a cell subpopulation count as a variable in place of or in addition to another variable in the diagnosis of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in a subject. For example, an immune cell subpopulation count can be used as a cell subpopulation count. As used herein, “cell subpopulation” refers to any population of cells having some type of a common feature among a cell population comprising cells with diverse characteristics. For a cell subpopulation with a specific name that is known in the art, the cell subpopulation can be mentioned using such a term. A specific cell subpopulation can also be mentioned by describing any property (e.g., expression of cell surface marker).

Examples of cell subpopulations include, but are not limited to, B cells, naïve B cells, memory B cells, plasmablasts, activated naïve B cells, transitional B cells, regulatory T cells, memory T cells, follicular helper T cells, Tfh1 cells, Tfh2 cells, Tfh17 cells, Th1 cells, Th2 cells, and Th17 cells.

The cell subpopulation count can be determined, for example, by flow cytometry by those skilled in the art. For example, the count can be observed by collecting a sample, reacting the sample with a fluorescently labeled antibody (an antibody to antigen of interest and a control antibody thereof) after removing red blood cells by hemolysis or specific gravity centrifugation, thoroughly washing the sample, and using flow cytometry. Detected scattered light or fluorescence is converted into an electric signal and analyzed with a computer. As a result, the intensity of FSC represents the size of cells, and the intensity of SSC represents the intracellular structure, which enable distinguishing lymphocytes, monocytes, and granulocytes. Subsequently, the cell population of interest is gated as needed to study the antigen expression form in the cells. In practicing the method of the present disclosure, those skilled in the art can appropriately identify the indicated cell surface marker to fractionate or count the cells.

Examples of particularly useful cell subpopulations include B cells and regulatory T cells (Treg). The present disclosure can use one or more variables comprising the B cell count and/or Treg count in addition to or in place of another variable as an indicator of ME/CFS.

As the cell subpopulation count, the ratio with respect to a suitable baseline can be used. The B cell count is, for example, the frequency (%) of B cells in peripheral blood mononuclear cells. The Treg count is, for example, the frequency (%) of Treg in all CD4 positive T cells.
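
As a small worked example, a count expressed relative to a baseline is the number of events in the gated subpopulation divided by the number of events in the parent population, given as a percentage. The function name and the event counts are hypothetical placeholders, not measured values.

```python
def percent_of_parent(subset_events, parent_events):
    """Frequency (%) of a gated subpopulation within its parent population."""
    if parent_events <= 0:
        raise ValueError("parent population must contain at least one event")
    return 100.0 * subset_events / parent_events

# Placeholder flow cytometry event counts for illustration only.
b_cell_percent = percent_of_parent(1200, 20000)  # B cells within PBMC
treg_percent = percent_of_parent(450, 9000)      # Treg within CD4 positive T cells
```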

In Example 3-2 and Table 7 herein, some cell subpopulation variables are extracted with significance at 0.1≤P<0.2 in simple regression analysis. Example 6 herein has found that some of the multiple combinations of variables comprising a cell subpopulation variable exhibit an AUC value of 0.8 or greater in ROC analysis (Table 18).

(Combination)

As shown herein, one or more variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR in a subject can be used as an indicator of ME/CFS in the subject. One or more variables can be any combination of variables described herein. Preferably, the variables comprise a combination of two or more variables selected from the group consisting of a usage frequency of one or more genes in an IgG H chain variable region of a BCR in a subject, a BCR diversity index in a subject, and one or more immune cell subpopulation counts in a subject. Two or more variables can be three or more, four or more, five or more, six or more, or any greater number of variables.

The Examples herein demonstrate that a combination of a large number of variables exhibits a high AUC under a ROC curve in regression analysis for differentiating a normal control from ME/CFS. Those skilled in the art can appropriately combine variables and use a combination exhibiting a specified AUC. In one embodiment, a combination of variables can exhibit AUC ≥ 0.7, AUC ≥ 0.8, AUC ≥ 0.85, AUC ≥ 0.9, AUC ≥ 0.95, or AUC ≥ 0.99 under a ROC curve in regression analysis. Those skilled in the art can appropriately compute the AUC under a ROC curve in regression analysis for a particular combination of variables in accordance with the method described herein, and determine whether the particular combination of variables exhibits a desired AUC. Examples of combinations of variables include, but are not limited to, a combination of usage frequencies of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6 and B cell count in a subject, a combination of usage frequencies of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6 and Treg count in a subject, and the like.

Although not essential, further combining a variable shown to be predictive by simple regression analysis can lead to higher predictive performance. In the Examples herein, among combinations of variables exhibiting a significant difference at p<0.2 between a patient group and a control group in simple regression analysis, 46% of combinations of two variables exhibited AUC ≥0.7 in ROC analysis, 81% of combinations of three variables exhibited AUC ≥0.7 in ROC analysis, and 96% of combinations of four variables exhibited AUC ≥0.7 in ROC analysis.

In the present disclosure, a subject can be diagnosed as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) by, for example, the following steps. A plurality of variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR can be provided with regard to myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS). Next, a differentiation formula generated by multivariate analysis for said variables can be provided. Providing the differentiation formula can comprise, for example, performing univariate or multivariate logistic regression for the variables, with distinction between patient/healthy individual as an objective variable; computing a constant and coefficient of a differentiation formula from a constant and a partial regression coefficient of a logit model formula generated in the logistic regression; and generating a differentiation formula based on the constant and the coefficient obtained from said processing. Next, a value of a variable for a subject can be fitted into the differentiation formula to compute the probability of suffering from ME/CFS. If the probability of suffering from ME/CFS is greater than a predetermined value, the subject can be determined as suffering from ME/CFS.

More specifically, a subject can be diagnosed as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in the present disclosure by, for example, the following steps. In the method, a combination of variables is specified. Specifying the combination of variables can comprise (1) comparing a healthy individual and a myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) patient and providing a plurality of variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR for which a significant difference is detected, for subjects including the ME/CFS patient and the healthy individual; and/or (2) performing univariate logistic analysis using a gene in an IgG H chain variable region of the BCR as an independent variable or multivariate logistic analysis using two or more genes in an IgG H chain variable region of the BCR as independent variables to obtain a logistic regression model formula, and performing ROC analysis for measuring a degree of fit of the logistic regression model formula to select a gene exhibiting a higher AUC value as a variable for a differentiation formula.

Next, a differentiation formula generated by multivariate analysis for the variables for the differentiation formula can be provided. Providing the differentiation formula can comprise, for example, performing univariate or multivariate logistic regression for the variables, with distinction between patient/healthy individual as a response variable; computing a constant and a coefficient of a differentiation formula from a constant and a partial regression coefficient of a logit model formula generated in the logistic regression; and generating a differentiation formula based on the constant and the coefficient obtained from said processing. Next, a probability of suffering from ME/CFS can be computed by fitting values of the variables of the subject into the differentiation formula. The subject can be determined as suffering from ME/CFS if the probability of suffering from ME/CFS is greater than a predetermined value.
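The final two steps above (fitting values of the variables into the differentiation formula and comparing the resulting probability with a predetermined value) can be sketched as follows. This is an illustration under stated assumptions only: the function names, the constant, the coefficients, the subject values, and the cutoff of 0.5 are all hypothetical and are not values obtained in the Examples herein.

```python
import math


def differentiation_probability(constant, coefficients, values):
    """Probability from a logit model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i)))).

    constant: b0 of the differentiation formula.
    coefficients: {variable name: partial regression coefficient b_i}.
    values: {variable name: measured value x_i for the subject}.
    """
    missing = [name for name in coefficients if name not in values]
    if missing:
        raise KeyError(f"missing variable(s): {missing}")
    logit = constant + sum(b * values[name] for name, b in coefficients.items())
    return 1.0 / (1.0 + math.exp(-logit))


def is_positive(probability, cutoff=0.5):
    """Determine the subject as suffering from ME/CFS if p exceeds the cutoff."""
    return probability > cutoff


# Hypothetical constant and coefficients, NOT values from the Examples.
constant = -4.0
coefficients = {"IGHV3-49": 0.8, "IGHV3-30-3": 0.6, "Treg": -0.2}
# Hypothetical subject values (% frequency).
subject = {"IGHV3-49": 3.0, "IGHV3-30-3": 4.0, "Treg": 5.0}

p = differentiation_probability(constant, coefficients, subject)
print(f"probability = {p:.3f}, positive = {is_positive(p)}")
```

In practice, the constant and coefficients would be taken from the logit model generated by logistic regression on the patient and healthy individual groups, and the cutoff would be the predetermined value chosen for the intended use.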

(Differential Diagnosis)

An embodiment of the present disclosure provides a method of distinguishing ME/CFS from another disease in a subject. In one embodiment of the present disclosure, differential diagnosis is performed by using a formula comprising an indicator (variable) resulting in a difference between a non-ME/CFS group, including a normal control and a patient of another disease, and an ME/CFS group. In such a case, a differentiation formula including an IGH gene that is selected as being effective in differentiation from the ME/CFS group, with healthy individuals and MS patients as the non-ME/CFS group, can be used. IGH genes with a significant difference are found from a significance test between the ME/CFS group and non-ME/CFS group in the Examples herein. As one or more variables, variables comprising a usage frequency of at least one gene selected from the group consisting of IGHV3-49, IGHV3-30-3, IGHD1-26, IGHV3-30, IGHJ6, IGHGP, IGHV4-31, IGHV3-64, IGHD3-22, IGHV3-33, IGHV3-73, IGHV5-10-1, and IGHV4-34 can be used.

Alternatively, another embodiment envisions a form where, after differentiation with a differentiation formula for distinguishing a normal control from an ME/CFS patient, a differentiation formula for differentiating from another disease (e.g., MS) is fitted for each such disease to rule out the possibility of having another disease. In such a case, the differentiation for distinguishing ME/CFS from a normal control described herein is performed, and subsequently (or before or concurrently), yet another differentiation formula is used for differentiation from another disease.

Specifically, a method comprising: (a) using a part of the one or more variables as an indicator of ME/CFS in the subject; and (b) using a part of the one or more variables as an indicator of the subject suffering from ME/CFS, and not another disease, can be provided. Another disease can comprise multiple sclerosis (MS). (b) can be performed a plurality of times for a plurality of other diseases.
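Steps (a) and (b) can be sketched as a two-stage check, assuming each stage is a logistic differentiation formula that outputs the probability of ME/CFS relative to its comparison group. The function names, the returned labels, the formulas, and the cutoff are hypothetical and are shown only to illustrate the flow; the order of the stages can also be reversed or run concurrently, as noted above.

```python
import math


def logistic(constant, coefficients, values):
    """Probability from a logit model with the given constant and coefficients."""
    z = constant + sum(b * values[name] for name, b in coefficients.items())
    return 1.0 / (1.0 + math.exp(-z))


def differential_diagnosis(values, control_formula, other_formulas, cutoff=0.5):
    """(a) ME/CFS vs. normal control, then (b) ME/CFS vs. each other disease.

    control_formula: (constant, coefficients) for ME/CFS vs. normal control.
    other_formulas: {disease name: (constant, coefficients)}, each assumed to
        output the probability of ME/CFS rather than that disease.
    """
    const, coefs = control_formula
    if logistic(const, coefs, values) <= cutoff:
        return "not ME/CFS"
    # Step (b) is repeated once per other disease to be ruled out.
    for disease, (const, coefs) in other_formulas.items():
        if logistic(const, coefs, values) <= cutoff:
            return f"possible {disease}"
    return "ME/CFS"


# Hypothetical formulas, NOT values from the Examples.
control_formula = (-3.0, {"IGHV3-49": 1.0})
other_formulas = {"MS": (-2.0, {"IGHV3-30-3": 1.0})}

print(differential_diagnosis({"IGHV3-49": 4.0, "IGHV3-30-3": 3.0},
                             control_formula, other_formulas))
```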

One embodiment of the present disclosure is a method of using one or more variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR in a subject as an indicator of the subject suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), and not another disease (e.g., multiple sclerosis, MS). Such a method can be performed in combination with differentiation for distinguishing ME/CFS from a normal control described herein as needed. In this regard, if it is desirable to distinguish MS, one or more genes can comprise at least one gene selected from the group consisting of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHD1-26, IGHV3-49, and IGHJ6. In addition, the one or more genes can be genes selected so that AUC ≥0.7, AUC ≥0.8, or AUC ≥0.9 is exhibited under a ROC curve in regression analysis for differentiating MS from ME/CFS. In the present disclosure, the number of IGH genes used for differentiation of MS from ME/CFS is not particularly limited. Any number of genes from 1 to 117 genes can be used. The one or more variables used as an indicator in the present disclosure can comprise a usage frequency of about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 15, about 20, about 30, about 40, about 50, about 60, about 70, about 80, about 90, about 100, about 110, or a greater number of IGH genes. When suitable, one or more variables can also be one variable.

As used herein, “subject” refers to any organism that can be subjected to the diagnosis, detection, or the like of the present disclosure, and is preferably a human.

As used herein, “sample” refers to any substance obtained from a subject. Examples thereof include peripheral blood, tissue biopsy sample, cell sample, lymph, saliva, urine, and the like. Those skilled in the art can appropriately select a preferred sample based on the descriptions herein.

As used herein, “or” is used when “at least one or more” of the listed matters in the sentence can be employed. When explicitly described herein as “within the range” of “two values”, the range also includes the two values themselves.

Reference literature such as scientific literature, patents, and patent applications cited herein is incorporated herein by reference to the same extent that the entirety of each document is specifically described.

As described above, the present disclosure has been described while showing preferred embodiments to facilitate understanding. The present disclosure is described hereinafter based on Examples. The above descriptions and the following Examples are not provided to limit the present disclosure, but for the sole purpose of exemplification. Thus, the scope of the present disclosure is not limited to the embodiments and Examples specifically described herein and is limited only by the scope of claims.

EXAMPLES

The Examples described hereinafter in the present specification use the following abbreviations when appropriate.

TABLE A
Denotation | Formal name | Computation method
IGHV | Immunoglobulin heavy chain V region | Usage frequency (%) of each IGHV gene in all IGHV
IGHD | Immunoglobulin heavy chain D region | Usage frequency (%) of each IGHD gene in all IGHD
IGHJ | Immunoglobulin heavy chain J region | Usage frequency (%) of each IGHJ gene in all IGHJ
IGHG | Immunoglobulin heavy chain G chain C region | Usage frequency (%) of each IGHG gene in all IGHV
Shannon | Shannon index | $H' = -\sum_{i=1}^{S} \frac{n_i}{N} \ln \frac{n_i}{N}$, wherein N: total number of reads, n_i: number of ith unique read, S: number of unique reads
Inverse | Inverse Simpson index | Inverse of Simpson index (1 − λ): $1 - \lambda = 1 - \sum_{i=1}^{S} \frac{n_i (n_i - 1)}{N (N - 1)}$
Pielou | Pielou's evenness index | Same as normalized Shannon index found by the following computational formula: $\frac{H'}{\ln N} = -\frac{1}{\ln N} \sum_{i=1}^{S} \frac{n_i}{N} \ln \frac{n_i}{N}$
DE50 | DE50 index | Ratio of the number of top unique reads accounting for 50% of all reads to the number of all unique reads
B cell | B cell count | Frequency (%) in peripheral blood mononuclear cells
nB | Naïve B cell count | Frequency (%) in B cells
mB | Memory B cell count | Frequency (%) in B cells
PB | Plasmablast count | Frequency (%) in B cells
aN | Activated naïve B cell count | Frequency (%) in B cells
TrB | Transitional B cell count | Frequency (%) in B cells
Treg | Regulatory T cell count | Frequency (%) in CD4 positive T cells
mCD4T | Memory T cell count | Frequency (%) in CD4 positive T cells
Tfh | Follicular helper T cell count | Frequency (%) in CD4 positive T cells
Tfh1 | Tfh1 cell count | Frequency (%) in Tfh cells
Tfh2 | Tfh2 cell count | Frequency (%) in Tfh cells
Tfh17 | Tfh17 cell count | Frequency (%) in Tfh cells
Th1 | Th1 cell count | Frequency (%) in CD4 positive T cells
Th2 | Th2 cell count | Frequency (%) in CD4 positive T cells
Th17 | Th17 cell count | Frequency (%) in CD4 positive T cells
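The four diversity indices in Table A can be sketched as follows, assuming the input is the list of copy numbers n_i of the unique reads in one sample. The function name `diversity_indices` is an illustrative assumption. The Pielou value is normalized by ln N as written in Table A, and the DE50 value takes the fewest top-ranked unique reads whose copies reach 50% of all reads, which is one reading of the definition in Table A.

```python
import math


def diversity_indices(copy_numbers):
    """Diversity indices of Table A from the copy numbers n_i of unique reads."""
    counts = sorted((n for n in copy_numbers if n > 0), reverse=True)
    N = sum(counts)   # total number of reads
    S = len(counts)   # number of unique reads
    if N < 2:
        raise ValueError("need at least two reads")

    shannon = -sum((n / N) * math.log(n / N) for n in counts)
    # "Inverse" in Table A is defined as 1 - lambda.
    inverse = 1.0 - sum(n * (n - 1) for n in counts) / (N * (N - 1))
    # Normalized by ln N, following the formula given in Table A.
    pielou = shannon / math.log(N)

    # DE50: number of top unique reads reaching 50% of all reads, divided by S.
    running, top = 0, 0
    for n in counts:
        running += n
        top += 1
        if running >= N / 2:
            break
    de50 = top / S

    return {"Shannon": shannon, "Inverse": inverse, "Pielou": pielou, "DE50": de50}
```

For example, `diversity_indices([1, 1, 1, 1])` describes a sample of four reads that are all different, and `diversity_indices([4])` describes a sample made up of four copies of a single read.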

Example 1

Next generation B cell receptor repertoire analysis using ME/CFS patient peripheral blood mononuclear cells

1. Materials and Methods 1.1. Separation of Peripheral Blood Mononuclear Cells and RNA Extraction

10 mL of whole blood was collected into a heparin-containing blood collection tube from the ME/CFS patients (37 cases) and healthy controls (23 cases) shown in Table 1, and peripheral blood mononuclear cells (PBMCs) were separated by Ficoll-Paque PLUS density gradient centrifugation. Total RNA was extracted/purified using RNeasy Lipid Tissue Mini Kit (Qiagen, Germany) from the isolated PBMCs. RNA was quantified using an Agilent 2100 Bioanalyzer (Agilent).

TABLE 1 ME/CFS patients
Item | ME/CFS patients (37 cases) | Healthy controls (23 cases) | p value
Age (mean ± SD) | 38.8 ± 11.4 | 39.4 ± 8.2 | 0.99
Sex (male:female) | 8:29 | 7:16 | 0.54
Duration of disease (years, mean (range)) | 10.9 (1-33) | - | -
Performance status (mean ± SD) | 5.9 ± 1.6 | - | -
Onset triggered by infection-like symptom (cases, %) | 22 cases (59.5%) | - | -
Performance status evaluated the severity in 10 levels from 0 to 9, 9 being the most severe. For p value, age was computed by Mann-Whitney U-test, and sex was computed by Fisher's exact test, where p<0.05 was considered significant.

1.2. Complementary DNA and Double-Stranded Complementary DNA Synthesis

BCR genes were amplified using adaptor-ligation PCR. Complementary DNA (cDNA) was synthesized using a restriction enzyme digestion site-containing Oligo dT primer (BSL-18E: Table 2) and a reverse transcriptase. Subsequently, E. coli DNA Ligase (Invitrogen), DNA polymerase I (Invitrogen), and RNaseH (Invitrogen) were used for double-stranded complementary DNA (ds-cDNA) synthesis. A 5′ end blunting reaction was then performed using a T4 DNA polymerase, and the end was cleaved with a restriction enzyme NotI. After column purification using a MinElute Reaction Cleanup Kit (QIAGEN), a P20EA/P10EA adaptor was added by a ligation reaction using a T4 ligase. The adaptor-added ds-cDNA was digested by NotI restriction enzyme.

1.3. PCR

To specifically amplify a heavy chain (IGH) gene of an immunoglobulin γ chain (IgG) of a B cell receptor (BCR), nested PCR was performed three times using a thermal cycler 1100 (Bio Rad) with KAPA HiFi HS ReadyMix (Nippon Genetics). P20EA and CG1 were used to perform the 1st PCR, and P20EA and CG2 were subsequently used to perform the 2nd PCR reaction. Furthermore, a tag sequence required for sequencing was added with P22EA-ST1-R and CG-ST1-R. After removing exogenous residual primer using Agencourt AMPure beads, index was added using Nextera XT Index Kit v2 Set A (Illumina).

TABLE 2 Primer sequences
Primer | Sequence
BSL18E | AAAGCGGCCGCATGCTTTTTTTTTTTTTTTTTTVN
P20EA | GGGAATTCGG
P10EA | TAATACGACTCCGAATTCCC
CG1 | CACCTTGGTGTTGCTGGGCTT
CG2 | TCCTGAGGACTGTAGGACAGC
P22EA-ST1-R | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCTAATACGACTCCGAATTCCC
CG-ST1-R | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGTGAGTTCCACGACACCGTCAC
(corresponding to SEQ ID NO: 1 to 7 from the top)

1.4. Next Generation Sequence Analysis

The concentrations of amplicons were measured using Qubit® 3.0 fluorometer (Thermo Fisher Scientific). After diluting the amplicons to 4 nM, a portion of PhiX Control v3 (Illumina) was mixed in to prepare the final specimen. Paired-end sequencing was performed on the final specimen with MiSeq Reagent Kit v3 (600 Cycle, Illumina) using Illumina's MiSeq sequencer.

1.5. Analysis Using Repertoire Analysis Software

The IGH gene V region sequence (IGHV), D region sequence (IGHD), J region sequence (IGHJ), and C region sequence (IGHC) were collated and the CDR3 sequence was determined with repertoire analysis software Repertoire Genesis by using a pair of FASTQ base sequence data sets acquired by MiSeq sequencing. The number of copies of unique reads in each sample was counted, with reads having the same IGHV, IGHD, IGHJ, IGHC, and CDR3 amino acid sequences being a unique read. The diversity index and usage frequencies of IGHV, IGHD, IGHJ, and IGHC of each sample were computed from the results of tallying unique reads.
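The tallying step can be sketched as follows, assuming each annotated read is available as a tuple of (IGHV, IGHD, IGHJ, IGHC, CDR3 amino acid sequence). The function names and the toy reads are hypothetical, and weighting each gene by the copy number of its unique reads is an assumption of this sketch; the internal processing of the repertoire analysis software is not reproduced here.

```python
from collections import Counter


def tally_unique_reads(reads):
    """Count copies of each unique read.

    reads: iterable of (IGHV, IGHD, IGHJ, IGHC, CDR3 amino acid sequence) tuples.
    Reads sharing all five fields are treated as one unique read.
    """
    return Counter(reads)


def usage_frequency(unique_reads, position):
    """Usage frequency (%) of each gene at tuple index `position`.

    position: 0 = IGHV, 1 = IGHD, 2 = IGHJ, 3 = IGHC.
    Assumption: each unique read is weighted by its copy number.
    """
    per_gene = Counter()
    for read, copies in unique_reads.items():
        per_gene[read[position]] += copies
    total = sum(per_gene.values())
    if total == 0:
        return {}
    return {gene: 100.0 * n / total for gene, n in per_gene.items()}


# Toy reads with made-up CDR3 sequences, for illustration only.
reads = [
    ("IGHV3-49", "IGHD1-26", "IGHJ6", "IGHG1", "CARDYW"),
    ("IGHV3-49", "IGHD1-26", "IGHJ6", "IGHG1", "CARDYW"),
    ("IGHV1-3", "IGHD3-22", "IGHJ4", "IGHG2", "CAKGW"),
    ("IGHV3-49", "IGHD1-26", "IGHJ6", "IGHG1", "CARDFW"),
]
unique_reads = tally_unique_reads(reads)
ighv_usage = usage_frequency(unique_reads, 0)
```

The copy numbers in `unique_reads.values()` are also the n_i values from which the diversity indices of Table A are computed.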

2. Results

Sequence data for 200 thousand to 300 thousand reads was acquired from samples of ME/CFS patients and healthy controls (Table 3). A difference was not found in the total number of reads, assigned reads, number of in-frame reads, and number of unique reads between ME/CFS patients and healthy controls. IGHV, IGHD, IGHJ, and IGHC usage frequencies and diversity index computed from read data were compared between ME/CFS patients and healthy controls. For IGHV, IGHV1-3, IGHV3-30, IGHV3-30-3, and IGHV3-49 usage frequencies were significantly higher in ME/CFS patients as compared to healthy controls (FIG. 1, significance level of Mann-Whitney test; P<0.05, P<0.05, P<0.001, and P<0.001). IGHD1-26 for IGHD, and IGHJ6 for IGHJ exhibited a significantly higher frequency in ME/CFS patients as compared to healthy controls (FIG. 2, P<0.01 and P<0.05). Dot plots are shown in FIGS. 3A and 3B. These results suggest that measurement of usage frequencies of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHD1-26, IGHV3-49, and IGHJ6 by next generation BCR repertoire analysis can be utilized in the differentiation of ME/CFS, which does not have an effective biomarker (Table 4).

TABLE 3 Number of reads acquired by NGS sequencing
Number of reads (mean ± SD) | ME/CFS patients (37 cases) | Healthy controls (23 cases) | P value
Total reads | 289342.6 ± 112471.3 | 304907.2 ± 146736.6 | 0.67
Assigned reads | 162337.5 ± 40656.5 | 166100.2 ± 56946.8 | 0.78
In-frame reads | 158459 ± 39046.6 | 162362.7 ± 54851.7 | 0.77
Unique reads | 16936.5 ± 9067.3 | 15377.6 ± 6703.1 | 0.45

TABLE 4 Biomarker candidate I for ME/CFS differentiation
Use | Indicator | Variable
Independent | % frequency | IGHV1-3, IGHV3-30, IGHV3-30-3, IGHD1-26, IGHV3-49, IGHJ6

Example 2

Comparison and investigation of ME/CFS patient lymphocytes by flow cytometry

1. Method

Peripheral blood mononuclear cells (PBMCs) were separated by the method described in [Example 1]. The cells were then stained with various fluorescent dye labeled monoclonal antibodies, and the frequencies of the following lymphocyte subfractions were calculated with a flow cytometer (FACS Canto II and FACS Aria II flow cytometer (BD Biosciences)). (%) B cells: CD19+ cells/PBMC; naïve B cells (nB): CD19+CD27−/CD19+ cells; memory B cells (mB): CD19+CD27+CD180+/CD19+; plasmablasts (PBs): CD19+CD27+CD180−CD38high/CD19+; transitional B cells (TrB): CD19+CD27−CD24+Mito tracker green high/CD19+; memory CD4T cells (mCD4T): CD3+CD4+CD127+CD45RA−/CD3+CD4+; follicular helper T cells (Tfh): CD3+CD4+CD127+CD45RA−CXCR5+/CD3+CD4+; helper T cells 1 (Th1 cells): CD3+CD4+CD127+CD45RA−CXCR5−CXCR3+CCR6−/CD3+CD4+CD127+CD45RA−; helper T cells 2 (Th2 cells): CD3+CD4+CD127+CD45RA−CXCR5−CXCR3−CCR6−/CD3+CD4+CD127+CD45RA−; helper T cells 17 (Th17 cells): CD3+CD4+CD127+CD45RA−CXCR5−CXCR3−CCR6+/CD3+CD4+CD127+CD45RA−; follicular helper T cells 1 (Tfh1 cells): CD3+CD4+CD127+CD45RA−CXCR5+CXCR3+CCR6−/CD3+CD4+CD127+CD45RA−CXCR5+; follicular helper T cells 2 (Tfh2 cells): CD3+CD4+CD127+CD45RA−CXCR5+CXCR3−CCR6−/CD3+CD4+CD127+CD45RA−CXCR5+; follicular helper T cells 17 (Tfh17 cells): CD3+CD4+CD127+CD45RA−CXCR5+CXCR3−CCR6+/CD3+CD4+CD127+CD45RA−CXCR5+; regulatory T cells (Treg): CD3+CD4+CD45RA−CD127−CD25++/CD3+CD4+. Differences in the resulting frequencies between the ME/CFS patients and healthy subjects were tested by the Mann-Whitney U test. P<0.05 was considered statistically significant.

2. Results

As shown in FIGS. 4, 5, and 6, (%) B cells was significantly higher in the disease group, the frequency of regulatory T cells (Treg) was significantly lower in the disease group, and the frequency of follicular helper T cells 17 (Tfh17) was significantly higher in the disease group.

Example 3-1

Prediction and differentiation of ME/CFS utilizing IGH gene and cell subpopulation frequency data observed to have a significant difference between the two groups

1. Method

Frequency data for six IGH genes for which a significant difference was detected between ME/CFS patients (37 cases) and healthy controls (23 cases) were combined with frequency data for a cell subpopulation for which a significant difference was likewise detected, i.e., (%) B cells or (%) regulatory T cells, to study whether ME/CFS can be predicted (Table 5). Multivariate logistic analysis was performed using SPSS software (IBM). A Receiver Operating Characteristic (ROC) curve was created using prediction values for dependent variables of a regression formula. The ROC curve Area Under the Curve (AUC) value was found as a performance evaluation value for the prediction and determination of these variables.

2. Results

A high evaluation was obtained at an AUC value of 0.946 for 6 IGH gene variables and (%) B cells (FIG. 7). A very high evaluation was obtained at an AUC value of 0.957 in the analysis based on a regression formula using 6 IGH gene variables and (%) regulatory T cells (FIG. 8).

TABLE 5 Biomarker candidate II for ME/CFS differentiation
Use | Indicator | Variable
Composite, logistic regression formula | % frequency | IGHV1-3, IGHV3-30, IGHV3-30-3, IGHD1-26, IGHV3-49, IGHJ6, B-cell
Composite, logistic regression formula | % frequency | IGHV1-3, IGHV3-30, IGHV3-30-3, IGHD1-26, IGHV3-49, IGHJ6, Treg

Example 3-2

Extraction of variables for IGH genes and cell subpopulations with high predictive performance of ME/CFS by simple regression analysis

1. Method

Simple regression analysis was performed using one out of 74 types of IGHV, 32 types of IGHD, 6 types of IGHJ, 5 types of IGHC, 4 types of diversity indices, and 15 types of cell subpopulation frequency data as an independent variable and a binary variable between ME/CFS and healthy control as a dependent variable. The variables used are shown in Table 6. Variables satisfying the significance level P<0.05, 0.05≤P<0.1, and 0.1≤P<0.2 were extracted. Simple regression and logistic regression analysis utilized glm( ) of R, and ROC analysis utilized pROC of the R package.
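As a minimal sketch of the extraction step, assuming the p value of each variable has already been obtained from simple regression analysis, the variables can be grouped into the three significance bands as follows. The function name `band_by_p_value` and the p values in the example are hypothetical and are not results from the Examples herein.

```python
def band_by_p_value(p_values):
    """Group variables into the bands P<0.05, 0.05<=P<0.1, and 0.1<=P<0.2.

    p_values: {variable name: p value from simple regression analysis}.
    Variables with P>=0.2 are not extracted.
    """
    bands = {"P<0.05": [], "0.05<=P<0.1": [], "0.1<=P<0.2": []}
    # Sort by p value so that each band lists its most significant variable first.
    for name, p in sorted(p_values.items(), key=lambda item: item[1]):
        if p < 0.05:
            bands["P<0.05"].append(name)
        elif p < 0.1:
            bands["0.05<=P<0.1"].append(name)
        elif p < 0.2:
            bands["0.1<=P<0.2"].append(name)
    return bands


# Hypothetical p values, for illustration only.
example = {"gene_a": 0.003, "gene_b": 0.07, "index_c": 0.15, "cell_d": 0.4}
bands = band_by_p_value(example)
```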

TABLE 6 Variables used in regression analysis and ROC analysis
Classification | Name of variable
IGHV | IGHV1-2, IGHV1-3, IGHV1-8, IGHV1-18, IGHV1-24, IGHV1-38-4, IGHV1-45, IGHV1-46, IGHV1-58, IGHV1-69, IGHV1-69-2, IGHV1-69D, IGHV1/OR15-1, IGHV1/OR15-5, IGHV1/OR15-9, IGHV1/OR21-1, IGHV2-5, IGHV2-26, IGHV2-70, IGHV2-70D, IGHV2/OR16-5, IGHV3-7, IGHV3-9, IGHV3-11, IGHV3-13, IGHV3-15, IGHV3-16, IGHV3-20, IGHV3-21, IGHV3-23, IGHV3-23D, IGHV3-25, IGHV3-30, IGHV3-30-3, IGHV3-30-5, IGHV3-33, IGHV3-35, IGHV3-38, IGHV3-38-3, IGHV3-43, IGHV3-43D, IGHV3-48, IGHV3-49, IGHV3-53, IGHV3-64, IGHV3-64D, IGHV3-66, IGHV3-72, IGHV3-73, IGHV3-74, IGHV3-NL1, IGHV3/OR15-7, IGHV3/OR16-6, IGHV3/OR16-8, IGHV3/OR16-9, IGHV3/OR16-10, IGHV3/OR16-12, IGHV3/OR16-13, IGHV4-4, IGHV4-28, IGHV4-30-2, IGHV4-30-4, IGHV4-31, IGHV4-34, IGHV4-38-2, IGHV4-39, IGHV4-59, IGHV4-61, IGHV4/OR15-8, IGHV5-10-1, IGHV5-51, IGHV6-1, IGHV7-4-1, IGHV7-81 (total of 74 types)
IGHD | IGHD1-1, IGHD1-7, IGHD1-14, IGHD1-20, IGHD1-26, IGHD1/OR15-1a/b, IGHD2-2, IGHD2-8, IGHD2-15, IGHD2-21, IGHD2/OR15-2a/b, IGHD3-3, IGHD3-9, IGHD3-10, IGHD3-16, IGHD3-22, IGHD3/OR15-3a/b, IGHD4-4, IGHD4-11, IGHD4-17, IGHD4-23, IGHD4/OR15-4a/b, IGHD5-5, IGHD5-12, IGHD5-18, IGHD5-24, IGHD5/OR15-5a/b, IGHD6-6, IGHD6-13, IGHD6-19, IGHD6-25, IGHD7-27 (total of 32 types)
IGHJ | IGHJ1, IGHJ2, IGHJ3, IGHJ4, IGHJ5, IGHJ6 (total of 6 types)
IGHC | IGHG1, IGHG2, IGHG3, IGHG4, IGHGP (total of 5 types)
Diversity index | Shannon, Inverse, Pielou, DE50 (total of 4 types)
Cell subpopulation data | B cell, nB, mB, PB, aN, TrB, Treg, mCD4T, Tfh, Tfh1, Tfh2, Tfh17, Th1, Th2, Th17 (total of 15 types)

2. Results

From 74 types of IGHV, 32 types of IGHD, 6 types of IGHJ, 5 types of IGHC, 4 types of diversity indices, and 15 types of cell subpopulation frequency data for a total of 136 types of variables, 8 types of IGH genes and 2 types of cell subpopulation variables for a total of 10 types were extracted as variables satisfying the significance level of P<0.05 (Table 7). 10 types of IGH genes and 4 types of cell subpopulation frequency data were variables satisfying 0.05≤P<0.1. 17 types of IGH genes, 3 types of diversity indices, and 2 types of cell subpopulation variables were variables satisfying 0.1≤P<0.2. It is suggested that variables satisfying the significance level of P<0.2 are independently effective in predicting ME/CFS. In particular, it is suggested that variables satisfying the significance level of P<0.05 are independently very effective in predicting ME/CFS (Table 8). Variables satisfying the significance level of P<0.2 were used in analysis using multivariate logistic regression analysis.

When ROC analysis was performed with some of the variables independently, AUC values of IGHV3-49: 0.764, IGHV3-30-3: 0.759, IGHD1-26: 0.731, IGHJ6: 0.672, IGHV3-30: 0.693, IGHV1-3: 0.659, B cell: 0.771, Treg: 0.817, Shannon: 0.617, and inverse: 0.636 were obtained.

TABLE 7 Variables satisfying the significance level in simple regression analysis
Significance level | Variable
P < 0.05 | (IGH) IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, IGHD3-22 (total of 8 types); (Cell subpopulation) B cell, Treg (total of 2 types)
0.05 ≤ P < 0.1 | (IGH) IGHD6-6, IGHV3-33, IGHD4-23, IGHV3-30-5, IGHV3-23, IGHD6-13, IGHV3-64D, IGHV3-48, IGHV3-64, IGHG1 (total of 10 types); (Cell subpopulation) Tfh17, mB, Tfh2, nB (total of 4 types)
0.1 ≤ P < 0.2 | (IGH) IGHV3-73, IGHV1-69-2, IGHV5-51, IGHV4-31, IGHV3-23D, IGHV1/OR15-9, IGHV4-39, IGHD5-12, IGHV3-43D, IGHD4-17, IGHV5-10-1, IGHD4/OR15-4a/b, IGHG4, IGHV1/OR15-5, IGHV3/OR16-9, IGHD1-7, IGHV3-21 (total of 17 types); (Diversity index) Shannon, Inverse, Pielou (total of 3 types); (Cell subpopulation) Th17, aN (total of 2 types)

TABLE 8 Biomarker candidate III for ME/CFS differentiation
Use | Indicator | Variable
Independent, regression formula | % frequency | IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, IGHD3-22, B-cell, Treg

Example 4

Prediction and differentiation of ME/CFS using two types of IGH genes

1. Method

Multivariate logistic regression analysis was performed for a combination of any two types of IGH genes from 74 types of IGHV, 32 types of IGHD, 6 types of IGHJ, and 5 types of IGHC for a total of 117 types of IGH genes, and ROC curves were created using prediction values for dependent variables of each regression formula. An AUC value exhibiting predictive performance of ME/CFS was computed with respect to the ROC curves. Combinations of IGH genes exhibiting an AUC value of 0.7 or 0.8 or greater and a list of IGH genes used in the combinations were extracted. Logistic regression analysis utilized glm( ) of R, and ROC analysis utilized pROC of the R package.
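The exhaustive screen over gene pairs can be sketched as follows. The function names are illustrative assumptions, and the model fitting and ROC analysis for each pair are left to a caller-supplied function `auc_of_pair`, since the Examples herein perform those steps with glm( ) and pROC in R. With 117 IGH genes there are 117 × 116 / 2 = 6786 candidate pairs.

```python
from itertools import combinations


def screen_pairs(genes, auc_of_pair, thresholds=(0.7, 0.8)):
    """Return, per threshold, the gene pairs whose AUC is at or above it.

    genes: list of gene names.
    auc_of_pair: caller-supplied function (gene_a, gene_b) -> AUC; fitting the
        logistic regression and running ROC analysis happen inside it.
    """
    hits = {t: [] for t in thresholds}
    for pair in combinations(genes, 2):
        auc = auc_of_pair(*pair)
        for t in thresholds:
            if auc >= t:
                hits[t].append((pair, auc))
    # List the best-performing pairs first, as in a ranked results table.
    for t in hits:
        hits[t].sort(key=lambda item: item[1], reverse=True)
    return hits


def genes_used(pairs_with_auc):
    """Sorted list of genes appearing in at least one retained pair."""
    return sorted({gene for pair, _ in pairs_with_auc for gene in pair})


# Number of unordered pairs that can be drawn from 117 genes.
n_pairs = sum(1 for _ in combinations(range(117), 2))
```

A usage example would pass the list of gene names and a function that fits the two-variable model on the cohort and returns its AUC; `genes_used` then gives the list of genes appearing in the retained combinations.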

2. Results

Combinations of IGH genes exhibiting an AUC value of 0.8 or greater or 0.7 or greater in ROC analysis using any two types of IGH genes were 30 and 637 sets, respectively. Table 11 shows combinations of variables exhibiting AUC ≥0.7 by binary logistic regression and ROC analysis. 26 IGH genes were used in the combinations exhibiting an AUC value of 0.8 or greater, and 117 genes were used in the combinations exhibiting an AUC value of 0.7 or greater (Table 9). 10 (33%) of the 30 sets of gene combinations exhibiting an AUC value of 0.8 or greater are combinations of variables exhibiting significance in simple regression analysis, and 20 sets (67%) had one of the IGHs exhibiting significance in simple regression analysis (Table 10). These results show that ME/CFS patients can be predicted and differentiated by combining any IGH genes.

TABLE 9 IGH genes that can be utilized in prediction and differentiation of ME/CFS
AUC value ≥ 0.8 | IGHV3-49, IGHV3-30-3, IGHGP, IGHD6-6, IGHV3-30, IGHV3-64D, IGHD1-26, IGHG4, IGHV1-3, IGHV1-69-2, IGHV1/OR15-9, IGHV1/OR21-1, IGHV3-30-5, IGHV3-38, IGHV3-38-3, IGHV3-43D, IGHV3-64, IGHV3-73, IGHV3-NL1, IGHV3/OR16-6, IGHV4-31, IGHD3-22, IGHD4-23, IGHD5-12, IGHD6-13, IGHJ6 (26 genes)
AUC value ≥ 0.7 | IGHV3-30-3, IGHV3-49, IGHD1-26, IGHV3-30, IGHV3-64, IGHJ6, IGHV3-30-5, IGHD4-23, IGHV3-64D, IGHV1-3, IGHV3-23, IGHD6-6, IGHGP, IGHV1/OR15-9, IGHD3-22, IGHV4-34, IGHV1-8, IGHV4-31, IGHD1/OR15-1a/b, IGHV3-33, IGHV3-73, IGHV3/OR15-7, IGHV5-10-1, IGHV5-51, IGHD6-13, IGHG1, IGHV1/OR15-5, IGHV1/OR21-1, IGHD5-12, IGHV3-43D, IGHV7-81, IGHD2-21, IGHV1-69-2, IGHV3-48, IGHV3/OR16-8, IGHD1-7, IGHG4, IGHD4-4, IGHD4-11, IGHV3-38, IGHV3-NL1, IGHD4/OR15-4a/b, IGHD6-25, IGHV3-9, IGHV3-72, IGHV3/OR16-6, IGHV3/OR16-9, IGHV3/OR16-10, IGHV4-39, IGHD1-20, IGHD3-3, IGHD4-17, IGHG2, IGHV1-58, IGHV2/OR16-5, IGHV3-38-3, IGHV3-66, IGHV3/OR16-13, IGHV4-28, IGHD2-15, IGHD3-10, IGHD7-27, IGHV1-45, IGHV1-69, IGHV1-69D, IGHV1/OR15-1, IGHV2-5, IGHV2-26, IGHV3-7, IGHV3-15, IGHV3-16, IGHV3-25, IGHV3-35, IGHV3/OR16-12, IGHV4-4, IGHV4-30-4, IGHV7-4-1, IGHD1-14, IGHD2-2, IGHD2-8, IGHD3-9, IGHJ1, IGHG3, IGHV1-2, IGHV1-18, IGHV1-24, IGHV1-38-4, IGHV1-46, IGHV2-70, IGHV2-70D, IGHV3-11, IGHV3-13, IGHV3-20, IGHV3-21, IGHV3-23D, IGHV3-43, IGHV3-53, IGHV3-74, IGHV4-30-2, IGHV4-38-2, IGHV4-59, IGHV4-61, IGHV4/OR15-8, IGHV6-1, IGHD1-1, IGHD2/OR15-2a/b, IGHD3-16, IGHD3/OR15-3a/b, IGHD5-5, IGHD5-18, IGHD5-24, IGHD5/OR15-5a/b, IGHD6-19, IGHJ2, IGHJ3, IGHJ4, IGHJ5 (117 genes)

TABLE 10 Prediction of ME/CFS patients from combination of any two types of IGH genes
Variable 1 | Variable 2 | Number of data | AUC
IGHGP*1 | IGHV3-30-3 | 60 | 0.848414
IGHV3-30-3 | IGHV3-49 | 60 | 0.843713
IGHV3-49 | IGHV3-64D | 60 | 0.842538
IGHV1-3 | IGHV3-49 | 60 | 0.841363
IGHV3-49 | IGHD3-22 | 60 | 0.840188
IGHV3-30-5 | IGHV3-49 | 60 | 0.831962
IGHD1-26 | IGHD6-6 | 60 | 0.829612
IGHV3-49 | IGHD1-26 | 60 | 0.826087
IGHV3-30-3 | IGHD6-6 | 60 | 0.824912
IGHV3-38 | IGHV3-49 | 60 | 0.823737
IGHV3-49 | IGHD4-23 | 60 | 0.823737
IGHV3-30 | IGHV3-49 | 60 | 0.819036
IGHV3-30-3 | IGHJ6 | 60 | 0.815511
IGHV3-49 | IGHV3-64 | 60 | 0.811986
IGHGP | IGHV3-30 | 60 | 0.810811
IGHV3-30-3 | IGHV3-64D | 60 | 0.810811
IGHV3-49 | IGHV3-73 | 60 | 0.810811
IGHV1/OR15-9 | IGHV3-30-3 | 60 | 0.809636
IGHV1/OR21-1 | IGHV3-49 | 60 | 0.808461
IGHV3-49 | IGHD6-6 | 60 | 0.808461
IGHV3-30-3 | IGHD5-12 | 60 | 0.807286
IGHV3-30-3 | IGHD6-13 | 60 | 0.807286
IGHV3-49 | IGHV4-31 | 60 | 0.807286
IGHV3-49 | IGHV3/OR16-6 | 60 | 0.806111
IGHG4 | IGHV3-49 | 60 | 0.80376
IGHV1-69-2 | IGHV3-49 | 60 | 0.80376
IGHGP | IGHV3-49 | 60 | 0.802585
IGHV3-30-3 | IGHV3-43D | 60 | 0.802585
IGHV3-49 | IGHV3-NL1 | 60 | 0.80141
IGHV3-30-3 | IGHV3-38-3 | 60 | 0.800235
*1 Underlined genes are genes found to have significance at significance level of P < 0.05 by simple regression analysis

TABLE 11 Combinations of variables exhibiting AUC ≥ 0.7 from binary logistic regression and ROC analysis Gene 1 Gene 2  1- 1 IGHGP IGHV3-30-3 2 IGHV3-30-3 IGHV3-49 3 IGHV3-49 IGHV3-64D 4 IGHV1-3 IGHV3-49 5 IGHV3-49 IGHD3-22 6 IGHV3-30-5 IGHV3-49 7 IGHD1-26 IGHD6-6 8 IGHV3-49 IGHD1-26 9 IGHV3-30-3 IGHD6-6 10 IGHV3-38 IGHV3-49 11 IGHV3-49 IGHD4-23 12 IGHV3-30 IGHV3-49 13 IGHV3-30-3 IGHJ6 14 IGHV3-49 IGHV3-64 15 IGHGP IGHV3-30 16 IGHV3-30-3 IGHV3-64D 17 IGHV3-49 IGHV3-73 18 IGHV1/OR15-9 IGHV3-30-3 19 IGHV1/OR21-1 IGHV3-49 20 IGHV3-49 IGHD6-6 21 IGHV3-30-3 IGHD5-12 22 IGHV3-30-3 IGHD6-13 23 IGHV3-49 IGHV4-31 24 IGHV3-49 IGHV3/OR16-6 25 IGHG4 IGHV3-49 26 IGHV1-69-2 IGHV3-49 27 IGHGP IGHV3-49 28 IGHV3-30-3 IGHV3-43D 29 IGHV3-49 IGHV3-NL1 30 IGHV3-30-3 IGHV3-38-3 31 IGHV1-69-2 IGHV3-30-3 32 IGHV3-30-3 IGHD4/OR15-4a/b 33 IGHV3-20 IGHV3-49 34 IGHV3-49 IGHD4-11 35 IGHG4 IGHV3-30-3 36 IGHV1/OR15-9 IGHD1-26 37 IGHV3-49 IGHV4-30-4 38 IGHV3-30-3 IGHV4-39 39 IGHV3-49 IGHV4-30-2 40 IGHV2-5 IGHV3-30-3 41 IGHV3-49 IGHD4-4 42 IGHV1/OR15-5 IGHV3-49 43 IGHV3-23 IGHV3-49 44 IGHV3-30-3 IGHD4-4 45 IGHV1/OR21-1 IGHV3-30-3 46 IGHV3-30-3 IGHD1/OR15-1a/b 47 IGHV3-30-3 IGHD6-25 48 IGHV3-48 IGHV3-49 49 IGHV3-30-3 IGHD4-11 50 IGHV3-35 IGHV3-49 51 IGHV3-49 IGHV3/OR16-9 52 IGHV3-49 IGHD3-16 53 IGHV1/OR15-9 IGHV3-49 54 IGHV3-30-3 IGHD2-21 55 IGHV3-33 IGHV3-49 56 IGHV3-49 IGHV5-10-1 57 IGHV3-23D IGHV3-30-3 58 IGHV3-30-3 IGHV7-81 59 IGHV3-64D IGHD1-26 60 IGHV3-30-3 IGHD4-17 61 IGHV3-49 IGHV3/OR16-12 62 IGHV3-49 IGHJ6 63 IGHV3-30-3 IGHD3-22 64 IGHV3-49 IGHD4/OR15-4a/b 65 IGHV3-49 IGHD5-5 66 IGHV3-73 IGHD1-26 67 IGHGP IGHV3-30-5 68 IGHV2/OR16-5 IGHV3-49 69 IGHV3-49 IGHD6-13 70 IGHV2-70 IGHV3-30-3 71 IGHV3-15 IGHV3-49 72 IGHV3-30-3 IGHD2-15 73 IGHV3-49 IGHD1/OR15-1a/b 74 IGHV3-49 IGHD3-3 75 IGHV3-49 IGHD5-18 76 IGHV3-30-3 IGHV5-10-1 77 IGHV3-49 IGHV4-34 78 IGHV3-49 IGHJ2 79 IGHV3-25 IGHV3-49 80 IGHV3-30-3 IGHD3-3 81 IGHV3-30-3 IGHJ3 82 IGHV3-49 IGHV3/OR16-8 83 IGHV3-49 IGHV4-4 84 IGHV3-49 IGHD1-7 85 
IGHV3-49 IGHD1-20 86 IGHD1-26 IGHD1/OR15-1a/b 87 IGHGP IGHV3-64 88 IGHV3-30-3 IGHV4-28 89 IGHV3-49 IGHV3-66 90 IGHV3-49 IGHV3/OR16-13 91 IGHV3-49 IGHD5-24 92 IGHV3-9 IGHV3-49 93 IGHV3-11 IGHV3-49 94 IGHV3-23D IGHV3-49 95 IGHV3-49 IGHV4-39 96 IGHV2-5 IGHV3-49 97 IGHV3-30-3 IGHD1-26 98 IGHG1 IGHV3-30-3 99 IGHV1-2 IGHV3-49 100 IGHV1-8 IGHV3-30-3 101- 101 IGHV1/OR15-1 IGHV3-49 102 IGHV1/OR15-5 IGHV3-30-3 103 IGHV3-30-3 IGHD3-10 104 IGHV3-49 IGHV4-59 105 IGHGP IGHD6-6 106 IGHV3-30-3 IGHV4-61 107 IGHV3-30-3 IGHD3-16 108 IGHV3-30-3 IGHJ5 109 IGHV3-49 IGHV3-53 110 IGHV3-49 IGHD5-12 111 IGHV2-26 IGHV3-30-3 112 IGHV3-30-3 IGHD1-14 113 IGHV3-49 IGHD2-2 114 IGHD1-7 IGHD1-26 115 IGHV3-30-3 IGHV3-38 116 IGHV3-49 IGHD2-15 117 IGHG3 IGHV3-30-3 118 IGHV1-2 IGHV3-30-3 119 IGHV1-3 IGHD1-26 120 IGHV1-38-4 IGHV3-49 121 IGHV1/OR15-5 IGHD1-26 122 IGHV2/OR16-5 IGHV3-30-3 123 IGHV3-30-3 IGHV3/OR16-6 124 IGHV3-30-3 IGHV3/OR16-8 125 IGHV3-30-3 IGHD1-7 126 IGHV3-30-3 IGHD2-8 127 IGHV3-43 IGHV3-49 128 IGHV3-49 IGHV3/OR15-7 129 IGHV3-49 IGHV4-61 130 IGHV3-49 IGHV7-4-1 131 IGHV3-49 IGHV7-81 132 IGHV1-45 IGHV3-30-3 133 IGHV1/OR21-1 IGHD1-26 134 IGHV3-7 IGHV3-49 135 IGHV3-30-3 IGHV3-73 136 IGHV3-30-3 IGHV4-31 137 IGHV3-49 IGHV3-74 138 IGHV3-49 IGHV5-51 139 IGHV1-69 IGHV3-30-3 140 IGHV3-23 IGHV3-33 141 IGHV3-43D IGHV3-49 142 IGHV3-49 IGHD2-21 143 IGHV3-49 IGHD4-17 144 IGHV3-49 IGHD6-19 145 IGHV3-49 IGHD6-25 146 IGHV3-49 IGHJ3 147 IGHV3-30-5 IGHD5-12 148 IGHV3-49 IGHV4-38-2 149 IGHV3-49 IGHD2-8 150 IGHV3-30-3 IGHD1-1 151 IGHV3-30-3 IGHJ4 152 IGHG1 IGHV3-49 153 IGHV1-69 IGHV3-49 154 IGHV2-70D IGHV3-49 155 IGHV3-11 IGHV3-30-3 156 IGHV3-13 IGHV3-49 157 IGHV3-25 IGHV3-30-3 158 IGHV3-30-3 IGHV3-66 159 IGHV3-30-3 IGHV4/OR15-8 160 IGHV3-30-3 IGHV7-4-1 161 IGHV3-49 IGHD7-27 162 IGHV3-49 IGHJ4 163 IGHGP IGHV3-23 164 IGHV2-26 IGHV3-49 165 IGHV3-16 IGHV3-49 166 IGHV3-30-3 IGHV3/OR16-13 167 IGHV3-30-3 IGHV4-30-4 168 IGHV3-49 IGHD3-10 169 IGHG2 IGHV3-49 170 IGHV1-8 IGHV3-49 171 IGHV1-24 IGHV3-49 172 IGHV3-7 
IGHV3-30-3 173 IGHV3-30-3 IGHV3-43 174 IGHV3-30-3 IGHD5/OR15-5a/b 175 IGHV3-30-3 IGHD7-27 176 IGHV3-49 IGHV3-72 177 IGHV3-49 IGHV4/OR15-8 178 IGHV3-49 IGHD3-9 179 IGHV3-23 IGHV3-30-3 180 IGHV1-18 IGHV3-49 181 IGHV1-38-4 IGHV3-30-3 182 IGHV1-46 IGHV3-30-3 183 IGHV1-53 IGHV3-49 184 IGHV1-58 IGHD1-26 185 IGHV1-69D IGHV3-49 186 IGHV1/OR15-9 IGHV3-30-5 187 IGHV3-30 IGHD6-6 188 IGHV3-30-3 IGHV3-72 189 IGHV3-30-3 IGHJ1 190 IGHV3-30-5 IGHD1-26 191 IGHV3-38-3 IGHV3-49 192 IGHV3-43D IGHD1-26 193 IGHV3-49 IGHD1-1 194 IGHV3-49 IGHD2/OR15-2a/b 195 IGHV3/OR15-7 IGHD1-26 196 IGHD1-26 IGHD4-4 197 IGHV1-18 IGHV3-30-3 198 IGHV1-45 IGHV3-49 199 IGHV3-30-3 IGHV3-33 200 IGHV3-30-3 IGHV3/OR16-10 201- 201 IGHV3-30-3 IGHV3/OR16-12 202 IGHV3-30-3 IGHV4-30-2 203 IGHV3-30-3 IGHD1-20 204 IGHV3-30-3 IGHD2/OR15-2a/b 205 IGHV3-30-3 IGHD5-5 206 IGHV3-30-3 IGHD6-19 207 IGHV3-30-5 IGHD6-6 208 IGHV3-49 IGHV6-1 209 IGHV3-49 IGHD3/OR15-3a/b 210 IGHV3-49 IGHJ1 211 IGHD1-26 IGHD4-11 212 IGHV2-70D IGHV3-30-3 213 IGHV3-30-3 IGHD3-9 214 IGHG3 IGHV3-49 215 IGHV1-69-2 IGHD1-26 216 IGHV2-70 IGHV3-49 217 IGHV3-16 IGHV3-30-3 218 IGHV3-20 IGHV3-30-3 219 IGHV3-30 IGHV3-64D 220 IGHV3-30-3 IGHV3/OR16-9 221 IGHV3-30-3 IGHV4-34 222 IGHD1-26 IGHD4/OR15-4a/b 223 IGHGP IGHV4-34 224 IGHV1-3 IGHV3-30-3 225 IGHV1-46 IGHV3-49 226 IGHV3-21 IGHV3-49 227 IGHV3-23D IGHD1-26 228 IGHV3-30-3 IGHV3-NL1 229 IGHV3-30-3 IGHV3/OR15-7 230 IGHV3-30-3 IGHV4-38-2 231 IGHV3-30-3 IGHV6-1 232 IGHV3-30-3 IGHD2-2 233 IGHV3-30-3 IGHD4-23 234 IGHV3-30-3 IGHD5-24 235 IGHV3-49 IGHV3/OR16-10 236 IGHV7-81 IGHD1-26 237 IGHD1-26 IGHD5-12 238 IGHG1 IGHV3-23 239 IGHV3-13 IGHV3-30-3 240 IGHV3-30 IGHV3-30-3 241 IGHV3-30-3 IGHV3-53 242 IGHV3-30-3 IGHV4-4 243 IGHV3-30-3 IGHV4-59 244 IGHV3-30-3 IGHJ2 245 IGHV3-49 IGHJ5 246 IGHD1-26 IGHD6-25 247 IGHD1-26 IGHJ6 248 IGHV3-30-3 IGHV3-48 249 IGHV3-49 IGHV4-28 250 IGHV3-49 IGHD1-14 251 IGHG2 IGHV3-30-3 252 IGHV1-24 IGHV3-30-3 253 IGHV1-58 IGHV3-30-3 254 IGHV1-69D IGHV3-30-3 255 IGHV3-30 IGHJ6 256 IGHV3-30-3 
IGHV3-30-5 257 IGHV3-30-3 IGHV5-51 258 IGHV3-30-3 IGHD5-18 259 IGHV3-49 IGHD5/OR15-5a/b 260 IGHV3-64 IGHD6-6 261 IGHD1-26 IGHD4-17 262 IGHG4 IGHD1-26 263 IGHGP IGHV1-3 264 IGHGP IGHD1-20 265 IGHV3-30 IGHD1-26 266 IGHV3-30-3 IGHV3-35 267 IGHV3-30-3 IGHD3/OR15-3a/b 268 IGHV3-64 IGHV3-64D 269 IGHV3-64D IGHJ6 270 IGHV3-30-3 IGHV3-64 271 IGHV3-30-3 IGHV3-74 272 IGHD1-26 IGHD4-23 273 IGHV3-64 IGHV3/OR16-8 274 IGHV4-31 IGHD1-26 275 IGHV1/OR15-1 IGHV3-30-3 276 IGHV3-9 IGHV3-30-3 277 IGHV3-21 IGHV3-30-3 278 IGHV3-7 IGHD1-26 279 IGHV3-15 IGHV3-30-3 280 IGHV4-61 IGHD1-26 281 IGHD1-26 IGHD3-22 282 IGHV3-23 IGHD6-13 283 IGHV3-64D IGHD3-22 284 IGHD1-26 IGHD7-27 285 IGHV1-3 IGHJ6 286 IGHV1-8 IGHD4-23 287 IGHV3-30-5 IGHJ6 288 IGHV3-64 IGHD1-26 289 IGHV3-64 IGHD4/OR15-4a/b 290 IGHV4-39 IGHD1-26 291 IGHD4-23 IGHJ6 292 IGHV1-8 IGHD1-26 293 IGHV3-30 IGHD3-22 294 IGHV3-73 IGHJ6 295 IGHG1 IGHJ6 296 IGHV1-69 IGHD1-26 297 IGHV1/OR15-9 IGHV3-30 298 IGHV1/OR21-1 IGHV3-30-5 299 IGHV3-30-5 IGHV3-43D 300 IGHV1-3 IGHD6-13 301- 301 IGHV1/OR21-1 IGHD4-23 302 IGHV3-15 IGHD1-26 303 IGHV3-64D IGHV3/OR16-8 304 IGHV5-10-1 IGHD1-26 305 IGHV5-51 IGHD1-26 306 IGHD6-13 IGHJ6 307 IGHV3-30 IGHV7-81 308 IGHD1-26 IGHD3-9 309 IGHD1-26 IGHD3-16 310 IGHV3-16 IGHD1-26 311 IGHV3-48 IGHJ6 312 IGHV3/OR16-10 IGHD1-26 313 IGHV4-34 IGHD1-26 314 IGHD1-26 IGHD3-3 315 IGHD1-26 IGHJ3 316 IGHGP IGHD1-26 317 IGHV3-21 IGHD1-26 318 IGHV3-30-5 IGHV3/OR16-8 319 IGHV3/OR16-8 IGHD1-26 320 IGHV3-73 IGHD3-22 321 IGHV3/OR16-6 IGHD1-26 322 IGHD1-26 IGHJ2 323 IGHV1/OR15-1 IGHD1-26 324 IGHV3-13 IGHD1-26 325 IGHV3-64D IGHV4-31 326 IGHV3-64D IGHV5-51 327 IGHV4/OR15-8 IGHD1-26 328 IGHG3 IGHD1-26 329 IGHG2 IGHV3-23 330 IGHG2 IGHD1-26 331 IGHV3-53 IGHD1-26 332 IGHD1-26 IGHD2-21 333 IGHD1-26 IGHD5-24 334 IGHV1-3 IGHV3-64D 335 IGHV1-24 IGHD1-26 336 IGHV1-46 IGHD1-26 337 IGHV3-48 IGHD1-26 338 IGHV3-64 IGHV7-81 339 IGHV3-30 IGHD1/OR15-1a/b 340 IGHV3-30 IGHD5-12 341 IGHV3-30-5 IGHD1/OR15-1a/b 342 IGHV3-38-3 IGHD1-26 343 IGHV2/OR16-5 IGHD1-26 344 
IGHV7-4-1 IGHD1-26 345 IGHG1 IGHV3-30 346 IGHV1-2 IGHD1-26 347 IGHV2/OR16-5 IGHV3-30 348 IGHV3-23 IGHD1-26 349 IGHV3-30 IGHV5-10-1 350 IGHV3-30 IGHD3-3 351 IGHV4-28 IGHD1-26 352 IGHD1-26 IGHD2-15 353 IGHGP IGHD3-22 354 IGHV1-3 IGHD4/OR15-4a/b 355 IGHV3-9 IGHD6-6 356 IGHV3/OR16-12 IGHD1-26 357 IGHV4-4 IGHD1-26 358 IGHV4-30-4 IGHD1-26 359 IGHD1-26 IGHD5-5 360 IGHD1-26 IGHJ5 361 IGHG1 IGHD3-22 362 IGHG4 IGHV3-30 363 IGHV1-69-2 IGHV3-30 364 IGHV2-70 IGHD1-26 365 IGHV2-70D IGHD1-26 366 IGHV3-30 IGHV4-31 367 IGHV3-30 IGHD2-21 368 IGHV3-64 IGHD3-22 369 IGHV3-64D IGHV4-34 370 IGHV3-66 IGHD1-26 371 IGHV3-72 IGHD1-26 372 IGHV5-51 IGHD4-23 373 IGHD1-26 IGHD2/OR15-2a/b 374 IGHD1-26 IGHD5-18 375 IGHD1-26 IGHD5/OR15-5a/b 376 IGHD1-26 IGHD6-19 377 IGHD1-26 IGHJ1 378 IGHD3-22 IGHJ6 379 IGHV1-3 IGHV7-81 380 IGHV3-23 IGHD3-22 381 IGHV3-43 IGHD1-26 382 IGHV3-73 IGHD3-3 383 IGHD1-20 IGHD1-26 384 IGHD1-26 IGHD2-3 385 IGHD1-26 IGHD6-13 386 IGHV1/OR15-9 IGHV3-64 387 IGHV1/OR21-1 IGHV3-30 388 IGHV3-30 IGHD4-4 389 IGHV3-30 IGHD4-23 390 IGHD4-23 IGHD5-12 391 IGHV3-9 IGHD1-26 392 IGHV4-30-2 IGHD1-26 393 IGHD1-26 IGHD3-10 394 IGHD1-26 IGHD3/OR15-3a/b 395 IGHG1 IGHD1-26 396 IGHV3-25 IGHD1-26 397 IGHV3-30 IGHD4/OR15-4a/b 398 IGHV3-30 IGHD6-13 399 IGHV3-64 IGHD6-25 400 IGHV3-64D IGHV3-73 401- 401 IGHV3/OR16-9 IGHD1-26 402 IGHV3/OR16-13 IGHD1-26 403 IGHD3-22 IGHD6-13 404 IGHV1-3 IGHV3-30 405 IGHV3-11 IGHD1-26 406 IGHV3-23 IGHJ6 407 IGHV3-30 IGHV3-33 408 IGHV3-30 IGHD4-17 409 IGHV3-38 IGHD1-26 410 IGHV5-51 IGHJ6 411 IGHD3-3 IGHJ6 412 IGHD6-6 IGHJ6 413 IGHV1-8 IGHV3-38 414 IGHV1-69D IGHD1-26 415 IGHV3-23 IGHD6-6 416 IGHV3-30 IGHD4-11 417 IGHV3-33 IGHD1-26 418 IGHV3-35 IGHD1-26 419 IGHV4-39 IGHJ6 420 IGHGP IGHV3-73 421 IGHV1-38-4 IGHD1-26 422 IGHV1/OR15-9 IGHV4-31 423 IGHV1/OR15-9 IGHD4-23 424 IGHV3-20 IGHD1-26 425 IGHV3-30 IGHV3-38-3 426 IGHV3-64D IGHD6-13 427 IGHD1-14 IGHD1-26 428 IGHD1-26 IGHJ4 429 IGHV1-8 IGHV3/OR16-6 430 IGHV3-23 IGHV3-48 431 IGHV3-64 IGHJ6 432 IGHV3-NL1 IGHD1-26 433 
IGHV1/OR15-5 IGHD4-23 434 IGHV2-26 IGHD1-26 435 IGHV3-30-5 IGHV3-64D 436 IGHV3-33 IGHV3-64D 437 IGHV3-64 IGHD1/OR15-1a/b 438 IGHV3-64 IGHD4-17 439 IGHV3-64D IGHV3/OR16-9 440 IGHV3-64D IGHV7-81 441 IGHV3-73 IGHD1/OR15-1a/b 442 IGHV3-73 IGHD4-23 443 IGHD1-26 IGHD2-2 444 IGHD4-11 IGHD4-23 445 IGHG1 IGHV1-3 446 IGHV1/OR21-1 IGHV4-31 447 IGHV3-23 IGHD5-12 448 IGHV3-30-5 IGHD2-21 449 IGHD3-22 IGHD4-23 450 IGHGP IGHJ6 451 IGHD4-17 IGHJ6 452 IGHV2-5 IGHD1-26 453 IGHV3-30-5 IGHD4-11 454 IGHV3-64 IGHD6-13 455 IGHV3/OR15-7 IGHD4-4 456 IGHV3/OR16-10 IGHD1-7 457 IGHV4-4 IGHJ6 458 IGHD1-1 IGHD1-26 459 IGHD3-22 IGHD6-6 460 IGHV1-18 IGHD1-26 461 IGHV1-45 IGHD1-26 462 IGHV1-69-2 IGHJ6 463 IGHV1/OR15-5 IGHV4-34 464 IGHV1/OR15-9 IGHV3-NL1 465 IGHV1/OR21-1 IGHJ6 466 IGHV3-64 IGHV3/OR16-6 467 IGHV3-72 IGHV3/OR15-7 468 IGHV3-74 IGHD1-26 469 IGHV4-34 IGHD6-25 470 IGHV4-59 IGHD1-26 471 IGHV6-1 IGHD1-26 472 IGHD5-12 IGHD6-6 473 IGHV3-64 IGHD2-8 474 IGHG4 IGHV3-30-5 475 IGHGP IGHV3-64D 476 IGHGP IGHV4-31 477 IGHV1-8 IGHV3-30-5 478 IGHV1/OR15-5 IGHV3-30-5 479 IGHV3-64 IGHD5-12 480 IGHV3/OR16-10 IGHD1/OR15-1a/b 481 IGHD2-21 IGHJ6 482 IGHD4-23 IGHD6-6 483 IGHG4 IGHD4-23 484 IGHG4 IGHD6-6 485 IGHGP IGHV5-51 486 IGHGP IGHD1-7 487 IGHV1-3 IGHD5-12 488 IGHV1-8 IGHV3-NL1 489 IGHV1/OR15-9 IGHJ6 490 IGHV1/OR21-1 IGHD2-21 491 IGHV3-23 IGHV3-30-5 492 IGHV3-30 IGHV3-43D 493 IGHV3-48 IGHV3-64D 494 IGHV4-34 IGHD1/OR15-1a/b 495 IGHV1-3 IGHV1/OR21-1 496 IGHV1-69-2 IGHV3-30-5 497 IGHV3-30 IGHV3/OR16-8 498 IGHV3/OR15-7 IGHD6-6 499 IGHV4-38-2 IGHD1-26 500 IGHGP IGHV3-48 501- 501 IGHV1-8 IGHV3-64D 502 IGHV1/OR15-5 IGHV4-31 503 IGHV3-9 IGHJ6 504 IGHV3-23 IGHV3-43D 505 IGHV3-30 IGHV4-34 506 IGHV3-64 IGHD4-4 507 IGHV3-64 IGHD4-23 508 IGHV3-64 IGHD7-27 509 IGHV3/OR15-7 IGHD4-11 510 IGHV5-51 IGHD6-6 511 IGHV1-3 IGHV3-30-5 512 IGHV1-3 IGHV5-51 513 IGHV1-69 IGHD4-23 514 IGHV1-69-2 IGHV5-51 515 IGHV1-69D IGHV3-23 516 IGHV3-23 IGHV3-30 517 IGHV3-30-5 IGHV3-73 518 IGHV3-64 IGHV3-66 519 IGHV3-64 IGHD3-10 520 IGHV3/OR15-7 
IGHD1/OR15-1a/b 521 IGHV4-31 IGHV5-10-1 522 IGHV4-31 IGHD4-23 523 IGHV4-34 IGHD1-20 524 IGHV2-5 IGHV3-30 525 IGHV5-10-1 IGHD6-6 526 IGHV1-69-2 IGHV3-64 527 IGHV1/OR15-5 IGHJ6 528 IGHV3-30 IGHD7-27 529 IGHV3-38 IGHJ6 530 IGHV3-64 IGHV4-34 531 IGHV3-23 IGHV4-39 532 IGHV3-23 IGHD4-23 533 IGHV3-30 IGHV3/OR16-6 534 IGHV3-30-5 IGHD4-4 535 IGHV3-33 IGHD4-23 536 IGHV3-43D IGHJ6 537 IGHV3-64 IGHD4-11 538 IGHV3-64D IGHD6-6 539 IGHV3-NL1 IGHJ6 540 IGHV3/OR15-7 IGHV3/OR16-10 541 IGHV4-31 IGHD6-6 542 IGHD1/OR15-1a/b IGHD4-23 543 IGHD4-4 IGHD4-23 544 IGHV1-3 IGHV1/OR15-9 545 IGHV1-58 IGHJ6 546 IGHV1/OR15-9 IGHV3-23 547 IGHV3-16 IGHD1/OR15-1a/b 548 IGHV3-30 IGHD1-7 549 IGHV3-64 IGHD1-20 550 IGHV3-64D IGHV4-39 551 IGHV3-NL1 IGHD6-6 552 IGHV3/OR16-13 IGHJ6 553 IGHG1 IGHV3-64D 554 IGHV1-8 IGHV3-30 555 IGHV3-30-5 IGHV5-10-1 556 IGHV3-72 IGHD4-23 557 IGHV3/OR15-7 IGHD1-7 558 IGHV1-3 IGHV1/OR15-5 559 IGHV1-3 IGHV3-64 560 IGHV1-3 IGHD3-10 561 IGHV1/OR15-9 IGHV4-34 562 IGHV3-9 IGHV3-30 563 IGHV3-15 IGHJ6 564 IGHV3-23 IGHV3/OR16-8 565 IGHV3-30 IGHV3-66 566 IGHV3-30-5 IGHV3-38-3 567 IGHV3-30-5 IGHD1-7 568 IGHV3/OR15-7 IGHV5-10-1 569 IGHV1-3 IGHV3-43D 570 IGHV1-3 IGHD4-23 571 IGHV1/OR15-1 IGHV3-64 572 IGHV1/OR15-5 IGHV3-30 573 IGHV3-25 IGHV3-30 574 IGHV3-43D IGHV3-64 575 IGHV3-64 IGHV3/OR15-7 576 IGHV3-64 IGHV5-10-1 577 IGHV3-64 IGHD2-15 578 IGHV1-3 IGHV3-48 579 IGHV3-30 IGHV3-35 580 IGHV3-30 IGHV3/OR16-13 581 IGHV3-30 IGHV5-51 582 IGHV3-38 IGHV3-64 583 IGHV3-64 IGHD1-7 584 IGHV3-73 IGHD6-6 585 IGHV4-28 IGHD4-23 586 IGHV4-34 IGHV5-10-1 587 IGHD4/OR15-4a/b IGHJ6 588 IGHG1 IGHV3-64 589 IGHG2 IGHV3-30 590 IGHGP IGHV1/OR15-9 591 IGHV1-3 IGHV2/OR16-5 592 IGHV1/OR15-9 IGHD2-21 593 IGHV1/OR21-1 IGHV3-64 594 IGHV2-26 IGHV3-64 595 IGHV3-30 IGHV3-30-5 596 IGHV3-30-5 IGHV3-64 597 IGHV3-33 IGHD1/OR15-1a/b 598 IGHV3-64 IGHD2-2 599 IGHV3-64D IGHD2-21 600 IGHV4-34 IGHV7-81 600- 600 IGHV7-81 IGHJ6 601 IGHD5-12 IGHD6-25 602 IGHV3-64 IGHV7-4-1 603 IGHG4 IGHV3-23 604 IGHV1-69-2 IGHD4-23 605 IGHV3-23 IGHD2-15 
606 IGHV3-30 IGHV4-28 607 IGHV1-3 IGHV3/OR16-8 608 IGHV1-8 IGHV3/OR16-9 609 IGHV1-8 IGHD3-22 610 IGHV3-23 IGHV3-64D 611 IGHV3-30 IGHV3-38 612 IGHV3-30 IGHD1-14 613 IGHV3-33 IGHD6-6 614 IGHV3-48 IGHV3-64 615 IGHV3-64 IGHJ1 616 IGHD3-9 IGHD4-23 617 IGHD6-25 IGHJ6 618 IGHG1 IGHD6-6 619 IGHGP IGHV3-33 620 IGHV1-3 IGHV5-10-1 621 IGHV1-8 IGHV5-10-1 622 IGHV1-45 IGHV3-30 623 IGHV1-58 IGHV3-30 624 IGHV1/OR15-5 IGHD2-21 625 IGHV1/OR15-9 IGHD6-13 626 IGHV3-7 IGHV3-64D 627 IGHV3-23 IGHV3-64 628 IGHV3-30 IGHV3-72 629 IGHV3-30 IGHV3/OR16-9 630 IGHV3-30 IGHV4-30-4 631 IGHV3-30-5 IGHV3/OR16-12 632 IGHV3-30-5 IGHV5-51 633 IGHV3-30-5 IGHV7-81 634 IGHV3-33 IGHD6-13 635 IGHV3-33 IGHJ6 636 IGHV3-43D IGHV4-31

Example 5

Multivariate logistic regression analysis using a plurality of IGH genes

1. Method

Multivariate logistic regression analysis was performed for every combination of any two types of IGH genes from 74 types of IGHV, 32 types of IGHD, 6 types of IGHJ, and 5 types of IGHC, for a total of 117 types of IGH genes. An ROC curve was created from the predicted values of the dependent variable given by the fitted regression formula, and an AUC value was computed from the curve. Simple regression and logistic regression analysis utilized glm( ) of R, and ROC analysis utilized the pROC package of R.
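The procedure described above (fit a logistic regression for each pair of genes, then score the fitted values by AUC) can be sketched as follows. This is only an illustrative sketch in Python using the standard library, not the R glm( )/pROC code used in the disclosure. The fit uses plain gradient ascent rather than the iteratively reweighted least squares of glm( ), and the gene names and usage frequencies are hypothetical toy values.

```python
import itertools
import math


def fit_logistic(rows, labels, lr=0.1, n_iter=2000):
    """Fit logistic regression by batch gradient ascent.

    Returns [intercept, w1, w2, ...]. A stand-in for R's glm(); assumption only.
    """
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(n_iter):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            err = y - predict(weights, x)
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w + lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def predict(weights, x):
    """Predicted probability of being a patient for one subject."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))


def auc(scores, labels):
    """AUC as the fraction of (patient, control) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pairwise_auc(freqs, labels):
    """freqs maps a gene name to a list of usage frequencies, one per subject."""
    results = {}
    for g1, g2 in itertools.combinations(sorted(freqs), 2):
        rows = list(zip(freqs[g1], freqs[g2]))
        weights = fit_logistic(rows, labels)
        results[(g1, g2)] = auc([predict(weights, x) for x in rows], labels)
    return results


# Toy data only (NOT from the disclosure): 3 hypothetical genes, 6 subjects.
toy_freqs = {
    "geneA": [0.10, 0.12, 0.11, 0.30, 0.28, 0.35],
    "geneB": [0.50, 0.40, 0.45, 0.42, 0.48, 0.44],
    "geneC": [0.05, 0.07, 0.04, 0.06, 0.05, 0.08],
}
toy_labels = [0, 0, 0, 1, 1, 1]  # 0 = healthy control, 1 = patient
toy_results = pairwise_auc(toy_freqs, toy_labels)
```

Replacing the `2` in `itertools.combinations` with 3 or 4 gives the three-variable and four-variable analyses. Note that the AUC here is computed on the same subjects used for fitting, which is how the text describes the analysis.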

2. Results

In multivariate logistic regression analysis and ROC analysis using any two variables from the 117 types of IGH genes, 30 sets (0.44%) out of the total of 6786 combinations exhibited an AUC value of 0.8 or greater (Table 12). These combinations used 26 IGH genes (Table 13). When any three variables were used, 4469 sets (1.7%) out of 260130 sets exhibited an AUC value of 0.8 or greater, 349 sets (0.13%) exhibited an AUC value of 0.85 or greater, and 4 sets (0.0015%) exhibited an AUC value of 0.9 or greater (Table 12). The IGH genes used in these combinations numbered 117 genes (100%), 102 genes (87%), and 6 genes (5.1%), respectively (Table 13). When four variables were used, 3264 sets (0.044%) exhibited an AUC value of 0.9 or greater, 99458 sets (1.4%) exhibited an AUC value of 0.85 or greater, and 489529 sets (6.6%) exhibited an AUC value of 0.8 or greater. With four variables, every one of the 117 genes appeared in at least one combination exhibiting an AUC value of 0.8 or greater. IGH genes that can be utilized in prediction and differentiation of ME/CFS are shown in Table 14. Combinations of genes with an AUC value of 0.9 or greater, which are understood to exhibit high predictive performance, are shown in Tables 15 and 16.
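The totals and percentages in this paragraph follow from the number of ways to choose 2, 3, or 4 genes out of 117. As a quick arithmetic check, a minimal Python sketch is given below; the function names are illustrative only.

```python
import math

# 74 IGHV + 32 IGHD + 6 IGHJ + 5 IGHC types, as stated in the Method.
N_GENES = 74 + 32 + 6 + 5


def n_combinations(k, n=N_GENES):
    """Number of unordered combinations of k genes out of n."""
    return math.comb(n, k)


def percent(hits, k, n=N_GENES):
    """Share (in %) of k-gene combinations that reached a given AUC threshold."""
    return 100.0 * hits / n_combinations(k, n)


# Counts of combinations with AUC >= 0.8, as reported in the text.
reported_auc_08 = {2: 30, 3: 4469, 4: 489529}
for k, hits in reported_auc_08.items():
    print(k, n_combinations(k), round(percent(hits, k), 2))
```

The same check applied to the four three-gene combinations with an AUC value of 0.9 or greater gives 4 / 260130, or about 0.0015%.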

TABLE 12 Number of combinations and frequencies of IGH genes exhibiting a high AUC value

                                Number of combinations (%)
Variables used  Total number    AUC ≥ 0.9       AUC ≥ 0.85      AUC ≥ 0.8
2               6,786           0 (0)           0 (0)           30 (0.44)
3               260,130         4 (0.0015)      349 (0.13)      4,469 (1.72)
4               7,413,705       3,264 (0.044)   99,458 (1.4)    489,529 (6.6)

TABLE 13 Number and frequency of IGH genes that can be used in a combination exhibiting a high AUC value

                                Number of genes (%)
Variables used  Total number    AUC ≥ 0.9       AUC ≥ 0.85      AUC ≥ 0.8
2               117             0 (0)           0 (0)           26 (22.2)
3               117             6 (5.1)         102 (87.2)      117 (100)
4               117             117 (100)       117 (100)       117 (100)

TABLE 14 IGH genes that can be utilized in prediction and differentiation of ME/CFS

Variables: 2; AUC: 0.8 or greater (26 genes)
IGHV3-49, IGHV3-30-3, IGHGP, IGHD6-6, IGHV3-30, IGHV3-64D, IGHD1-26, IGHG4, IGHV1-3, IGHV1-69-2, IGHV1/OR15-9, IGHV1/OR21-1, IGHV3-30-5, IGHV3-38, IGHV3-38-3, IGHV3-43D, IGHV3-64, IGHV3-73, IGHV3-NL1, IGHV3/OR16-6, IGHV4-31, IGHD3-22, IGHD4-23, IGHD5-12, IGHD6-13, IGHJ6

Variables: 3; AUC: 0.9 or greater (6 genes)
IGHGP, IGHV3-30-3, IGHV3-49, IGHV3-30, IGHD6-6, IGHV3-64D

Variables: 3; AUC: 0.85 or greater (102 genes)
IGHV3-49, IGHV3-30-3, IGHGP, IGHV1-3, IGHV3-64D, IGHD3-22, IGHD6-6, IGHV3-30-5, IGHD1-26, IGHV3-30, IGHD4-23, IGHV3-38, IGHV3/OR16-6, IGHJ6, IGHV1/OR21-1, IGHV3-73, IGHD4-4, IGHV3-33, IGHV3-64, IGHD4-11, IGHG4, IGHV1-69-2, IGHV3/OR16-8, IGHG1, IGHD5-12, IGHV3-NL1, IGHV4-30-4, IGHV4-34, IGHV1/OR15-5, IGHV1/OR15-9, IGHV2/OR16-5, IGHV3-43D, IGHV3-48, IGHV3/OR16-9, IGHD1/OR15-1a/b, IGHD3-3, IGHD4/OR15-4a/b, IGHV3-15, IGHV3-20, IGHV3-25, IGHD2-2, IGHV3/OR16-13, IGHV4-31, IGHV5-51, IGHD5-18, IGHD5-5, IGHD6-13, IGHV7-81, IGHV3/OR15-7, IGHV3-38-3, IGHV4-39, IGHD1-20, IGHD2-15, IGHJ5, IGHV4-28, IGHV5-10-1, IGHV1-18, IGHV1-58, IGHV1-69D, IGHV1-8, IGHV2-26, IGHV3-21, IGHV3-23D, IGHV3-7, IGHV3-23, IGHV4-30-2, IGHD1-7, IGHD3/OR15-3a/b, IGHD6-25, IGHV3-72, IGHV4-61, IGHG2, IGHG3, IGHV1-2, IGHV1/OR15-1, IGHV3-35, IGHV3-9, IGHD3-16, IGHV1-24, IGHV1-69, IGHV2-5, IGHV2-70, IGHV3-13, IGHV3-16, IGHD1-1, IGHD2-21, IGHD2-8, IGHD3-10, IGHD4-17, IGHD5-24, IGHD5/OR15-5a/b, IGHD6-19, IGHD7-27, IGHJ2, IGHJ3, IGHJ4, IGHV3-43, IGHV3/OR16-12, IGHV4-4, IGHV4-59, IGHV4/OR15-8, IGHV7-4-1

Variables: 3; AUC: 0.8 or greater (117 genes)
IGHV3-49, IGHV3-30-3, IGHGP, IGHD1-26, IGHD6-6, IGHV3-64D, IGHV3-30, IGHV1/OR15-9, IGHJ6, IGHV3-30-5, IGHV1-69-2, IGHV3-73, IGHV1/OR21-1, IGHG4, IGHD3-22, IGHV1-3, IGHD6-13, IGHD4/OR15-4a/b, IGHD5-12, IGHV3-64, IGHD4-23, IGHV3-38, IGHV4-31, IGHD4-11, IGHD4-4, IGHV3/OR16-6, IGHV3-43D, IGHD1/OR15-1a/b, IGHV3-NL1, IGHV4-30-4, IGHV1/OR15-5, IGHV3-38-3, IGHV3-23, IGHV3-20, IGHV3/OR16-8, IGHV5-10-1, IGHD1-7, IGHV3-33, IGHV4-39, IGHV4-30-2, IGHV3-48, IGHV2-5, IGHV7-81, IGHV3-23D, IGHD3-3, IGHD1-20, IGHG1, IGHD2-21, IGHV3/OR16-13, IGHV3-15, IGHV3/OR16-9, IGHV4-34, IGHV2/OR16-5, IGHD3-16, IGHD4-17, IGHV3-25, IGHD5-5, IGHV4-28, IGHD5-18, IGHD3-10, IGHD6-25, IGHD2-2, IGHV1-8, IGHD2-15, IGHV3-35, IGHV1/OR15-1, IGHV3/OR16-12, IGHV2-70, IGHV3-11, IGHV1-2, IGHD1-14, IGHV4-61, IGHV3/OR15-7, IGHJ3, IGHG3, IGHV3-66, IGHV3-9, IGHV5-51, IGHD2-8, IGHV4-4, IGHJ5, IGHV1-69D, IGHV2-26, IGHG2, IGHV1-58, IGHV3-43, IGHV7-4-1, IGHD1-1, IGHV3-7, IGHV4-59, IGHD7-27, IGHV3-13, IGHV3-16, IGHV3-21, IGHJ2, IGHV1-38-4, IGHV3-72, IGHV4-38-2, IGHV4/OR15-8, IGHD5-24, IGHJ4, IGHV1-18, IGHV1-69, IGHD2/OR15-2a/b, IGHD3-9, IGHD6-19, IGHV1-45, IGHV3/OR16-10, IGHV2-70D, IGHV3-53, IGHV1-24, IGHV1-46, IGHV3-74, IGHD5/OR15-5a/b, IGHV6-1, IGHD3/OR15-3a/b, IGHJ1

Variables: 4; AUC: 0.9 or greater
The 117 genes described above

TABLE 15 Combinations of IGH genes exhibiting high predictive performance (AUC value of 0.9 or greater) (3 types)

IGH gene 1  IGH gene 2  IGH gene 3  AUC value
IGHGP       IGHV3-30-3  IGHV3-49    0.925969
IGHGP       IGHV3-30-3  IGHD6-6     0.920094
IGHGP       IGHV3-30    IGHV3-49    0.902468
IGHGP       IGHV3-30-3  IGHV3-64D   0.900118

TABLE 16 Combinations of IGH genes exhibiting high predictive performance (AUC value of 0.9 or greater) (4 types) (up to 100th combination in order of AUC value)

IGH gene 1  IGH gene 2  IGH gene 3  IGH gene 4  AUC value
IGHGP  IGHV3-30-3  IGHV3-49  IGHD6-6  0.963572
IGHGP  IGHV3-30-3  IGHD6-6  IGHV3-49  0.963572
IGHV3-30-3  IGHV3-49  IGHD6-6  IGHGP  0.963572
IGHGP  IGHV3-49  IGHD6-6  IGHV3-30-3  0.963572
IGHGP  IGHV3-30-3  IGHJ6  IGHG1  0.954172
IGHG1  IGHV3-30-3  IGHJ6  IGHGP  0.954172
IGHG1  IGHGP  IGHV3-30-3  IGHJ6  0.954172
IGHG1  IGHGP  IGHJ6  IGHV3-30-3  0.954172
IGHGP  IGHV3-30-3  IGHV3-49  IGHV3-33  0.951821
IGHGP  IGHV3-30-3  IGHD6-6  IGHV4-28  0.951821
IGHV3-30-3  IGHV3-33  IGHV3-49  IGHGP  0.951821
IGHGP  IGHV3-30-3  IGHV4-28  IGHD6-6  0.951821
IGHGP  IGHV3-30-3  IGHV3-33  IGHV3-49  0.951821
IGHV3-30-3  IGHV4-28  IGHD6-6  IGHGP  0.951821
IGHGP  IGHV4-28  IGHD6-6  IGHV3-30-3  0.951821
IGHGP  IGHV3-33  IGHV3-49  IGHV3-30-3  0.951821
IGHGP  IGHV3-30-3  IGHV3-49  IGHV3-25  0.950646
IGHGP  IGHV3-30-3  IGHV3-49  IGHD3-22  0.950646
IGHGP  IGHV3-30-3  IGHV3-49  IGHJ6  0.950646
IGHGP  IGHV3-30-3  IGHJ6  IGHV3-49  0.950646
IGHGP  IGHV3-25  IGHV3-30-3  IGHV3-49  0.950646
IGHGP  IGHV3-49  IGHD3-22  IGHV3-30-3  0.950646
IGHV3-30-3  IGHV3-49  IGHD3-22  IGHGP  0.950646
IGHGP  IGHV3-30-3  IGHD3-22  IGHV3-49  0.950646
IGHV3-30-3  IGHV3-49  IGHJ6  IGHGP  0.950646
IGHV3-25  IGHV3-30-3  IGHV3-49  IGHGP  0.950646
IGHGP  IGHV3-25  IGHV3-49  IGHV3-30-3  0.950646
IGHGP  IGHV3-49  IGHJ6  IGHV3-30-3  0.950646
IGHGP  IGHV3-30-3  IGHV3-49  IGHD1-20  0.949471
IGHGP  IGHV3-30-3  IGHD1-20  IGHV3-49  0.949471
IGHGP  IGHV3-49  IGHD1-20  IGHV3-30-3  0.949471
IGHV3-30-3  IGHV3-49  IGHD1-20  IGHGP  0.949471
IGHGP  IGHV3-30-3  IGHD6-6  IGHV3-64D  0.948296
IGHGP  IGHV3-30  IGHV3-49  IGHD6-6  0.948296
IGHGP  IGHV3-30-3  IGHV3-64D  IGHD6-6  0.948296
IGHGP  IGHV3-30  IGHD6-6  IGHV3-49  0.948296
IGHV3-49  IGHV3-73  IGHD3-22  IGHGP  0.948296
IGHGP  IGHV3-49  IGHD3-22  IGHV3-73  0.948296
IGHGP  IGHV3-49  IGHD6-6  IGHV3-30  0.948296
IGHGP  IGHV3-49  IGHV3-73  IGHD3-22  0.948296
IGHV3-30-3  IGHV3-64D  IGHD6-6  IGHGP  0.948296
IGHV3-30  IGHV3-49  IGHD6-6  IGHGP  0.948296
IGHGP  IGHV3-64D  IGHD6-6  IGHV3-30-3  0.948296
IGHGP  IGHV3-73  IGHD3-22  IGHV3-49  0.948296
IGHGP  IGHV3-30-3  IGHV3-49  IGHV3/OR16-6  0.947121
IGHGP  IGHV3-30-3  IGHV3-49  IGHV4-34  0.947121
IGHV3-30-3  IGHV3-49  IGHV3/OR16-6  IGHGP  0.947121
IGHGP  IGHV3-30-3  IGHV4-34  IGHV3-49  0.947121
IGHGP  IGHV3-49  IGHV4-34  IGHV3-30-3  0.947121
IGHGP  IGHV3-30-3  IGHV3/OR16-6  IGHV3-49  0.947121
IGHV3-30-3  IGHV3-49  IGHV4-34  IGHGP  0.947121
IGHGP  IGHV3-49  IGHV3/OR16-6  IGHV3-30-3  0.947121
IGHGP  IGHV3-30-3  IGHV3-49  IGHD2-2  0.945946
IGHGP  IGHV3-30-3  IGHV3-49  IGHD4-11  0.945946
IGHV3-30-3  IGHV3-49  IGHD4-11  IGHGP  0.945946
IGHV3-30-3  IGHV3-49  IGHD2-2  IGHGP  0.945946
IGHGP  IGHV3-30-3  IGHD4-11  IGHV3-49  0.945946
IGHGP  IGHV3-30-3  IGHD2-2  IGHV3-49  0.945946
IGHGP  IGHV3-49  IGHD4-11  IGHV3-30-3  0.945946
IGHGP  IGHV3-49  IGHD2-2  IGHV3-30-3  0.945946
IGHGP  IGHV3-30-3  IGHV3-49  IGHG4  0.944771
IGHGP  IGHV3-30-3  IGHV3-49  IGHV3-64D  0.944771
IGHGP  IGHV3-30-3  IGHV3-64D  IGHV3-49  0.944771
IGHG4  IGHV3-30-3  IGHV3-49  IGHGP  0.944771
IGHV3-30-3  IGHV3-49  IGHV3-64D  IGHGP  0.944771
IGHG4  IGHGP  IGHV3-30-3  IGHV3-49  0.944771
IGHGP  IGHV3-49  IGHV3-64D  IGHV3-30-3  0.944771
IGHG4  IGHGP  IGHV3-49  IGHV3-30-3  0.944771
IGHGP  IGHV3-30-3  IGHV3-49  IGHV3/OR16-13  0.943596
IGHGP  IGHV3-30-5  IGHV3-49  IGHV4-34  0.943596
IGHGP  IGHV3-30-3  IGHV3/OR16-13  IGHV3-49  0.943596
IGHV3-30-3  IGHV3-49  IGHV3/OR16-13  IGHGP  0.943596
IGHGP  IGHV3-49  IGHV4-34  IGHV3-30-5  0.943596
IGHGP  IGHV3-30-5  IGHV4-34  IGHV3-49  0.943596
IGHV3-30-5  IGHV3-49  IGHV4-34  IGHGP  0.943596
IGHGP  IGHV3-49  IGHV3/OR16-13  IGHV3-30-3  0.943596
IGHGP  IGHV3-30-3  IGHV3-49  IGHV2/OR16-5  0.942421
IGHGP  IGHV3-30-3  IGHV3-49  IGHV3/OR16-8  0.942421
IGHGP  IGHV3-30-3  IGHV3-49  IGHD4-4  0.942421
IGHGP  IGHV3-30-3  IGHV3-49  IGHD5-12  0.942421
IGHV3-30-3  IGHV3-49  IGHD4-4  IGHGP  0.942421
IGHGP  IGHV3-30-3  IGHV3/OR16-8  IGHV3-49  0.942421
IGHGP  IGHV3-30-3  IGHD5-12  IGHV3-49  0.942421
IGHV3-30-3  IGHV3-49  IGHD5-12  IGHGP  0.942421
IGHGP  IGHV3-30-3  IGHD4-4  IGHV3-49  0.942421
IGHGP  IGHV2/OR16-5  IGHV3-30-3  IGHV3-49  0.942421
IGHV2/OR16-5  IGHV3-30-3  IGHV3-49  IGHGP  0.942421
IGHV3-30-3  IGHV3-49  IGHV3/OR16-8  IGHGP  0.942421
IGHGP  IGHV3-49  IGHV3/OR16-8  IGHV3-30-3  0.942421
IGHGP  IGHV3-49  IGHD4-4  IGHV3-30-3  0.942421
IGHGP  IGHV2/OR16-5  IGHV3-49  IGHV3-30-3  0.942421
IGHGP  IGHV3-30-3  IGHV3-49  IGHV5-51  0.941246
IGHGP  IGHV3-30-3  IGHD6-6  IGHD2-2  0.941246
IGHGP  IGHV3-30-3  IGHV5-51  IGHV3-49  0.941246
IGHGP  IGHV3-30-3  IGHD2-2  IGHD6-6  0.941246
IGHV3-30-3  IGHV3-49  IGHV5-51  IGHGP  0.941246
IGHV3-30-3  IGHD2-2  IGHD6-6  IGHGP  0.941246
IGHGP  IGHV3-49  IGHV5-51  IGHV3-30-3  0.941246
IGHGP  IGHV3-30-3  IGHV3-49  IGHV3-38-3  0.941246
IGHGP  IGHV3-30-3  IGHD6-6  IGHV3-38-3  0.941246

Example 6

Multivariate Logistic Regression Analysis with Combination of Variables

1. Method

Multivariate logistic regression analysis was performed with combinations of 2 variables, 3 variables, and 4 variables drawn from 46 types of variables (35 types of IGH genes, 8 types of cell subpopulations, and 3 types of diversity indices) that satisfied the significance level of p<0.2 in simple regression analysis. An ROC curve was created from the predicted values of the dependent variable given by the fitted regression formula, and an AUC value was calculated from the curve. Simple regression and logistic regression analysis utilized glm( ) of R, and ROC analysis utilized the pROC package of R.

2. Results

In multivariate logistic regression analysis and ROC analysis using any two variables from the 46 types of selected variables, 102 sets (9.9%) exhibited an AUC value of 0.8 or greater and 382 sets (36.9%) exhibited an AUC value of 0.7 or greater and less than 0.8 (Table 17). When three variables were used, 85 sets (0.6%) exhibited an AUC value of 0.9 or greater, 3472 sets (22.9%) exhibited an AUC value of 0.8 or greater and less than 0.9, and 8826 sets (58.1%) exhibited an AUC value of 0.7 or greater and less than 0.8. When 4 variables were used, 5330 sets (3.3%) exhibited an AUC value of 0.9 or greater, 63981 sets (39.2%) exhibited an AUC value of 0.8 or greater and less than 0.9, and 87291 sets (53.5%) exhibited an AUC value of 0.7 or greater and less than 0.8. It is understood that predictive performance for ME/CFS improves as more variables are combined. Table 18 shows combinations exhibiting an AUC value of 0.8 or greater with 2 variables, and Table 19 shows combinations exhibiting an AUC value of 0.9 or greater with 3 variables. Since combinations including Treg accounted for the majority of the top rankings, it is inferred that predictive performance is enhanced by using Treg as a variable.
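The tallies above sort each combination into a non-overlapping AUC band. A minimal Python sketch of that binning step is shown below. The band labels mirror the column headings of Table 17, the first three AUC values are taken from Tables 18 and 19, and the last two are hypothetical values added only to populate the lower bands.

```python
from collections import Counter


def auc_band(auc_value):
    """Assign an AUC value to one of the non-overlapping bands."""
    if auc_value >= 0.9:
        return "AUC >= 0.9"
    if auc_value >= 0.8:
        return "0.9 > AUC >= 0.8"
    if auc_value >= 0.7:
        return "0.8 > AUC >= 0.7"
    return "AUC < 0.7"


def tally_bands(auc_values):
    """Return band -> (count, percentage of all combinations)."""
    counts = Counter(auc_band(a) for a in auc_values)
    total = len(auc_values)
    return {band: (n, round(100.0 * n / total, 1)) for band, n in counts.items()}


# 0.952174, 0.891304, 0.871739 appear in Tables 18 and 19;
# 0.75 and 0.62 are hypothetical illustration values.
example = tally_bands([0.952174, 0.891304, 0.871739, 0.75, 0.62])
```

With 46 variables the denominators are C(46, 2) = 1035, C(46, 3) = 15180, and C(46, 4) = 163185, which match the totals listed for 2, 3, and 4 variables.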

TABLE 17 Number of combinations exhibiting high AUC value from 46 types of selected variables

                                           Number of combinations (% ratio)
Number of variables  Total combinations    AUC ≥ 0.9     0.9 > AUC ≥ 0.8   0.8 > AUC ≥ 0.7
2                    1035                  0             102 (9.9)         382 (36.9)
3                    15180                 85 (0.6)      3472 (22.9)       8826 (58.1)
4                    163185                5330 (3.3)    63981 (39.2)      87291 (53.5)

TABLE 18 Predictive performance of ME/CFS patient with a combination of any two variables from 46 types (AUC ≥ 0.8)

Classification*1  Variable 1  Variable 2  Number of data  AUC value
IGH/FCM  Treg*2  IGHV1-3  43  0.891304
IGH/FCM  Treg  IGHV3-23  43  0.886957
IGH/FCM  Treg  IGHGP  43  0.871739
FCM  Treg  Tfh17  43  0.869565
FCM  Treg  Tfh2  43  0.867391
IGH/FCM  Treg  IGHV1/15-9  43  0.867391
IGH/FCM  IGHV3-49  Treg  43  0.865217
IGH/FCM  Treg  IGHV3-30-3  43  0.865217
IGH/FCM  Treg  IGHV3-23D  43  0.863044
IGH/FCM  Treg  IGHV3-64D  43  0.86087
IGH/FCM  Treg  IGHV3-30  43  0.858696
IGH/FCM  Treg  IGHD3-22  43  0.858696
IGH/FCM  Treg  IGHV1/15-5  43  0.858696
IGH/FCM  B-cell  IGHV3-49  60  0.857814
IGH/FCM  Treg  IGHV3-30-5  43  0.856522
IGH/FCM  B-cell  IGHV3-30-3  60  0.854289
IGH/FCM  Treg  IGHV3-64  43  0.852174
IGH/FCM  Treg  IGHV3-33  43  0.85
IGH/FCM  Treg  IGHV1-69-2  43  0.85
IGH  IGHV3-30-3  IGHGP  60  0.848414
IGH/FCM  Treg  IGHV4-31  43  0.845652
IGH  IGHV3-49  IGHV3-30-3  60  0.843713
IGH/FCM  Treg  IGHD6-6  43  0.843478
IGH  IGHV3-49  IGHV3-64D  60  0.842538
IGH/FCM  B-cell  IGHD1-26  60  0.841363
IGH  IGHV3-49  IGHV1-3  60  0.841363
FCM  B-cell  Treg  43  0.841304
IGH/FCM  B-cell  IGHV3-23  60  0.840188
IGH  IGHV3-49  IGHD3-22  60  0.840188
IGH/FCM  Treg  IGHJ6  43  0.83913
IGH/FCM  Treg  IGHD4/15-4a/b  43  0.83913
IGH/FCM  Treg  IGHV3-73  43  0.834783
FCM  Treg  Th17  43  0.834783
IGH/FCM  Treg  IGHD1-26  43  0.832609
IGH  IGHV3-49  IGHV3-30-5  60  0.831962
IGH/FCM  IGHV3-49  aN  50  0.830882
IGH  IGHD1-26  IGHD6-6  60  0.829612
IGH/FCM  B-cell  IGHGP  60  0.828437
IGH/FCM  Treg  IGHG1  43  0.828261
IGH/FCM  Treg  IGHV5-51  43  0.828261
IGH/FCM  Treg  IGHV5-10-1  43  0.828261
IGH/FCM  Treg  IGHG4  43  0.828261
IGH/FCM  Treg  IGHV3/16-9  43  0.828261
FCM  B-cell  Tfh2  43  0.826087
IGH  IGHV3-49  IGHD1-26  60  0.826087
IGH/FCM  Treg  IGHV3-48  43  0.826087
IGH/FCM  Treg  Inverse  43  0.826087
IGH/FCM  Treg  IGHD4-17  43  0.826087
IGH/FCM  Treg  IGHV3-21  43  0.826087
IGH/FCM  B-cell  IGHD3-22  60  0.824912
IGH/FCM  B-cell  IGHV3-64  60  0.824912
IGH  IGHV3-30-3  IGHD6-6  60  0.824912
FCM  Treg  mB  43  0.823913
FCM  Treg  nB  43  0.823913
FCM/diversity  Treg  Shannon  43  0.823913
IGH/FCM  Treg  IGHV3-43D  43  0.823913
IGH  IGHV3-49  IGHD4-23  60  0.823737
IGH/FCM  B-cell  IGHD4-23  60  0.822562
IGH/FCM  IGHV3-30-3  Tfh2  43  0.821739
IGH/FCM  B-cell  IGHV3-64D  60  0.820212
IGH/FCM  Treg  IGHD5-12  43  0.819565
FCM/diversity  Treg  Pielou  43  0.819565
IGH  IGHV3-49  IGHV3-30  60  0.819036
IGH/FCM  B-cell  IGHV3-33  60  0.817861
IGH/FCM  B-cell  IGHD6-13  60  0.817861
IGH/FCM  Treg  IGHD1-7  43  0.817391
IGH/FCM  B-cell  IGHV3-30  60  0.816686
IGH/FCM  B-cell  IGHV3-30-5  60  0.815511
IGH/FCM  B-cell  IGHV3-48  60  0.815511
IGH  IGHV3-30-3  IGHJ6  60  0.815511
IGH/FCM  B-cell  IGHV5-10-1  60  0.814336
FCM  B-cell  Tfh17  43  0.813044
IGH/FCM  B-cell  IGHV1-69-2  60  0.811986
IGH  IGHV3-49  IGHV3-64  60  0.811986
IGH/FCM  Treg  IGHD4-23  43  0.81087
IGH/FCM  IGHV3-30-3  Th17  43  0.81087
IGH  IGHV3-49  IGHV3-73  60  0.810811
IGH  IGHV3-30-3  IGHV3-64D  60  0.810811
IGH  IGHV3-30  IGHGP  60  0.810811
IGH  IGHV3-30-3  IGHV1/15-9  60  0.809636
IGH  IGHV3-49  IGHD6-6  60  0.808461
IGH  IGHV3-49  IGHV4-31  60  0.807286
IGH  IGHV3-30-3  IGHD6-13  60  0.807286
IGH  IGHV3-30-3  IGHD5-12  60  0.807286
IGH/FCM  Treg  IGHD6-13  43  0.806522
IGH/FCM  Treg  IGHV4-39  43  0.806522
IGH/FCM  IGHV3-30-3  Tfh17  43  0.806522
IGH/FCM  B-cell  IGHV4-39  60  0.806111
IGH/FCM  B-cell  IGHJ6  60  0.804935
IGH/FCM  IGHV3-49  mB  60  0.804935
IGH/FCM  Tfh17  IGHD6-6  43  0.804348
IGH/FCM  IGHV3-49  nB  60  0.80376
IGH  IGHV3-49  IGHV1-69-2  60  0.80376
IGH  IGHV3-49  IGHG4  60  0.80376
IGH/FCM  B-cell  IGHD6-6  60  0.802585
IGH  IGHV3-49  IGHGP  60  0.802585
IGH  IGHV3-30-3  IGHV3-43D  60  0.802585
IGH/FCM  IGHV3-30-5  Tfh2  43  0.802174
IGH/FCM  B-cell  IGHV3-23D  60  0.80141
IGH/FCM  B-cell  IGHD4/15-4a  60  0.80141
IGH/FCM  B-cell  IGHG4  60  0.80141
IGH/FCM  B-cell  IGHD5-12  60  0.800235

*1 IGH/FCM: composite of IGH variable and cell subpopulation, IGH: only IGH variable, FCM: only cell subpopulation variable
*2 Treg is indicated by an underline

TABLE 19 Prediction of ME/CFS patient with combination of any 3 variables from 46 types (AUC ≥ 0.9)

Classification*1  Variable 1  Variable 2  Variable 3  Data  AUC value
IGH/FCM  Treg  IGHGP  IGHV1-3  43  0.952174
IGH/FCM  Treg  IGHGP  IGHV3-23  43  0.936957
IGH/FCM  Treg  IGHV3-30-3  IGHGP  43  0.934783
IGH/FCM  Treg  IGHGP  IGHV3-30-5  43  0.932609
IGH/FCM  Treg  IGHV3-33  IGHV3-23  43  0.932609
IGH/FCM  Treg  IGHD3-22  IGHV3-23  43  0.930435
IGH/FCM  Treg  IGHV1-3  IGHV1/OR15-9  43  0.930435
IGH/FCM  Treg  IGHV3-30  IGHGP  43  0.930435
IGH/FCM  Treg  IGHV3-30-3  IGHV1/OR15-9  43  0.928261
IGH/FCM  IGHV3-49  Treg  IGHV1-3  43  0.928261
IGH/FCM  Treg  IGHV3-23  IGHV1/OR15-9  43  0.926087
IGH  IGHV3-49  IGHV3-30-3  IGHGP  60  0.925969
IGH/FCM  Treg  IGHV1-3  IGHV1/OR15-5  43  0.923913
IGH/FCM  Treg  IGHV1-3  IGHD4/OR15-4a/b  43  0.923913
IGH/FCM  Treg  IGHV3-30-3  IGHV3-64D  43  0.921739
IGH  IGHV3-30-3  IGHGP  IGHD6-6  60  0.920094
IGH/FCM  Treg  Tfh2  IGHV3-64  43  0.919565
IGH/FCM  Treg  IGHV3-30-5  IGHV1/OR15-9  43  0.919565
IGH/FCM  IGHV3-49  IGHD3-22  aN  50  0.919118
IGH/FCM  B-cell  IGHV3-49  IGHD3-22  60  0.918919
IGH/FCM  B-cell  Treg  Tfh2  43  0.917391
IGH/FCM  Treg  IGHD3-22  Tfh17  43  0.917391
IGH/FCM  Treg  IGHV1-3  IGHV3-64D  43  0.917391
IGH/FCM  Treg  Tfh17  IGHV1/OR15-9  43  0.917391
IGH/FCM  IGHV3-49  Treg  IGHD3-22  43  0.917391
IGH/FCM  Treg  IGHD6-6  IGHV3-23  43  0.915217
IGH/FCM  Treg  IGHD3-22  IGHV1/OR15-9  43  0.915217
IGH/FCM  Treg  IGHV3-30-3  IGHJ6  43  0.915217
IGH/FCM  B-cell  Treg  IGHV3-23  43  0.913044
IGH/FCM  Treg  IGHV3-30  IGHV1/OR15-9  43  0.913044
IGH/FCM  Treg  IGHV1-3  Tfh2  43  0.91087
IGH/FCM  Treg  IGHV1-3  Tfh17  43  0.91087
IGH/FCM  Treg  IGHV3-23  IGHV3-64D  43  0.91087
IGH/FCM  Treg  IGHV3-30  IGHV3-33  43  0.91087
IGH/FCM  Treg  IGHV3-23  Tfh2  43  0.908696
IGH/FCM  Treg  IGHD3-22  IGHV3-64D  43  0.908696
IGH/FCM  IGHV3-49  Treg  IGHV3-30-3  43  0.908696
IGH/FCM  IGHV3-49  Treg  IGHV3-23  43  0.908696
IGH/FCM  Treg  IGHV3-30-3  IGHV3-23  43  0.908696
IGH/FCM  Treg  IGHV3-30-5  IGHV1/OR15-5  43  0.908696
IGH/FCM  Treg  IGHV3-23  IGHV1/OR15-5  43  0.908696
IGH/FCM  Treg  IGHV3-30-3  IGHV1-3  43  0.908696
IGH/FCM  Treg  IGHV3-64D  Tfh2  43  0.906522
IGH/FCM  Treg  Tfh17  IGHV3-64D  43  0.906522
IGH/FCM  Treg  IGHV3-23  IGHV3-64  43  0.906522
IGH/FCM  Treg  IGHD3-22  IGHV3-33  43  0.906522
IGH/FCM  Treg  IGHV3-30-3  IGHV1/OR15-5  43  0.906522
IGH/FCM  B-cell  Treg  IGHV1-3  43  0.906522
IGH/FCM  Treg  IGHV3-23  IGHD6-13  43  0.906522
IGH/FCM  Treg  IGHV3-23  Th17  43  0.904348
IGH/FCM  Treg  IGHV3-23  IGHV3-48  43  0.904348
IGH  IGHV3-49  IGHV3-30  IGHGP  60  0.902468
IGH/FCM  IGHV3-49  IGHD3-22  Th17  43  0.902174
IGH/FCM  Treg  IGHV3-23  IGHV4-31  43  0.902174
IGH/FCM  Treg  Tfh2  IGHV4-31  43  0.902174
IGH/FCM  Treg  Tfh2  IGHV3-73  43  0.902174
IGH/FCM  Treg  IGHGP  IGHV3-64D  43  0.902174
IGH/FCM  Treg  IGHV1-3  IGHV3-48  43  0.902174
IGH/FCM  Treg  IGHV3-23  IGHV3-43D  43  0.902174
IGH/FCM  Treg  IGHV3-30-3  IGHV3-33  43  0.902174
IGH/FCM  Treg  IGHV1-3  IGHV3-23D  43  0.902174
IGH/FCM  Treg  Tfh17  IGHV3-23  43  0.902174
IGH/FCM  Treg  IGHD4-23  IGHV3-23  43  0.902174
IGH/FCM  Treg  IGHV3-23  IGHV3-21  43  0.902174
IGH/FCM  Treg  IGHV3-23D  IGHV1/OR15-9  43  0.902174
IGH/FCM  Treg  IGHV1-3  IGHV1-69-2  43  0.902174
IGH/FCM  Treg  Tfh17  IGHV1-69-2  43  0.902174
IGH/FCM  Treg  IGHV3-23  IGHG1  43  0.902174
IGH/FCM  Treg  IGHGP  IGHD6-6  43  0.902174
IGH  IGHV3-30-3  IGHGP  IGHV3-64D  60  0.900118
IGH/FCM  Treg  IGHGP  Tfh17  43  0.9
IGH/FCM/diversity  Treg  IGHGP  Pielou  43  0.9
IGH/FCM  Treg  IGHV1-3  Inverse  43  0.9
IGH/FCM  Treg  IGHV1-3  IGHV4-39  43  0.9
IGH/FCM  Treg  IGHV3-64  IGHV3/OR16-9  43  0.9
IGH/FCM  Treg  IGHD3-22  IGHV3-64  43  0.9
IGH/FCM  IGHV3-49  Treg  IGHV3-30-5  43  0.9
IGH/FCM  B-cell  Treg  IGHV3-30-3  43  0.9
IGH/FCM  Treg  IGHGP  IGHV3-23D  43  0.9
IGH/FCM  Treg  IGHV1-3  IGHV3-21  43  0.9
IGH/FCM  Treg  IGHV3-23  IGHV1-69-2  43  0.9
IGH/FCM  Treg  Tfh17  IGHG4  43  0.9
IGH/FCM  Treg  IGHV1-3  IGHD6-6  43  0.9
IGH/FCM  Treg  IGHD3-22  IGHD6-6  43  0.9
IGH/FCM  Treg  IGHV3-23  IGHD4/OR15-4a/b  43  0.9

*1 IGH/FCM: composite of IGH variable and cell subpopulation, IGH: only IGH variable, FCM: only cell subpopulation variable
*2 Treg is indicated by an underline

While the maximum AUC value was 0.87 (Treg, Tfh17) for combinations of cell subpopulation variables alone, an AUC value exceeding 0.87 was obtained when a cell subpopulation variable was combined with an IGH gene, namely in Treg+IGHV1-3, Treg+IGHV3-23, and Treg+IGHGP. Further, a B cell count, when combined with IGHV3-49, IGHV3-30-3, or IGHD1-26, resulted in a higher AUC value than the AUC of 0.84 obtained when the B cell count was combined with Treg. Thus, using a specific IGH gene in addition to a cell subpopulation variable can attain higher predictive performance than a combination of cell subpopulation variables alone.

Example 7

Differentiation of ME/CFS Patient from Patient of Another Disease

1. Method

In addition to the samples from 37 myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) patients in Example 1, whole blood was newly collected from 10 multiple sclerosis (MS) patients. PBMCs were separated and RNA was extracted in accordance with the method described in 1. Materials and Methods of Example 1. Subsequently, complementary DNA and double stranded complementary DNA were synthesized and subjected to PCR, next-generation sequencing, and then analysis with repertoire analysis software. From the read data of each sample, the IGH gene V region sequence (IGHV), D region sequence (IGHD), J region sequence (IGHJ), and C region sequence (IGHC) were collated, and CDR3 sequences were determined. The number of unique reads for each sample was tallied, and usage frequencies for 74 types of IGHV, 32 types of IGHD, 6 types of IGHJ, and 5 types of IGHC were computed. A significance test was conducted on the read data and on the IGHV, IGHD, IGHJ, and IGHC usage frequencies using the Mann-Whitney test between the ME/CFS and MS groups. Simple regression and logistic regression analysis utilized glm( ) of R, and ROC analysis utilized the pROC package of R, using usage frequency data for ME/CFS patients and MS patients. In the ROC analysis, a Receiver Operating Characteristic (ROC) curve was created from the predicted values of the dependent variable given by the fitted regression formula. The Area Under the Curve (AUC) value of the ROC curve was found as a performance evaluation value for prediction and determination using these variables.
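Two steps in this method lend themselves to a short illustration: computing usage frequencies from per-read gene assignments, and comparing two groups with the Mann-Whitney test. The Python sketch below is a minimal example under stated assumptions. The text does not say which software performed the test, so the normal approximation used here (without tie or continuity correction) is an assumption, and with groups as small as 10 subjects an exact test may be preferable. All input values are hypothetical.

```python
import math
from collections import Counter


def usage_frequencies(gene_calls):
    """gene_calls holds one gene name per unique read; returns gene -> fraction."""
    counts = Counter(gene_calls)
    total = len(gene_calls)
    return {gene: n / total for gene, n in counts.items()}


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p value).

    The p value uses the normal approximation with no tie or continuity
    correction; this is an illustrative assumption, not the source's method.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u1, p


# Hypothetical reads for one sample and hypothetical frequencies for two groups.
freqs = usage_frequencies(["IGHV3-49", "IGHV3-49", "IGHV3-23", "IGHJ6"])
u_stat, p_value = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

In practice the test would be run once per gene, with `x` and `y` holding that gene's usage frequency in each ME/CFS patient and each MS patient.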

2. Results

2-1. Difference in IGH Gene Usage Frequencies Between ME/CFS Patients and MS Patients

Sequence data for 140 to 280 thousand reads was obtained from 10 MS patient samples (Table 20). A significant difference was not found between ME/CFS patients and MS patients in the total number of reads, number of assigned reads, number of in-frame reads, or number of unique reads. As a result of comparing IGH gene usage frequencies, usage frequency was significantly higher in MS patients compared to ME/CFS patients for IGHV3-7**, IGHV3-33*, IGHV3-73*, IGHV3-NL1*, IGHV4-28*, IGHV4-39*, IGHD4-17*, IGHD5-5*, and IGHD5-18* (Mann-Whitney test: *P<0.05, **P<0.01). Meanwhile, significantly lower usage frequencies of IGHV3-23** and IGHV3-23D* were exhibited in MS patients compared to ME/CFS patients (FIG. 9). Similarly, variables other than IGH genes, such as diversity indices (Shannon index, inverse Simpson index, Pielou's index, and DE50 index), were studied as to whether there is a difference between ME/CFS patients and MS patients (FIG. 10). A significant difference was not found in any of the diversity indices between ME/CFS patients and MS patients. These results show that diversity indices are not useful in the differentiation of ME/CFS patients from MS patients, but usage frequencies of IGH genes reflect disease specificity and are useful in differentiation.
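For orientation, three of the diversity indices named above can be sketched as follows. This assumes the common textbook definitions with natural logarithms, which may differ in detail from the definitions used elsewhere in this disclosure; the DE50 index is omitted because its definition is not given in this section, and the function names are hypothetical.

```python
import math


def shannon_index(read_counts):
    """Shannon index H = -sum(p * ln p) over clone read proportions."""
    total = sum(read_counts)
    return -sum((c / total) * math.log(c / total) for c in read_counts if c > 0)


def inverse_simpson_index(read_counts):
    """Inverse Simpson index 1 / sum(p^2)."""
    total = sum(read_counts)
    return 1.0 / sum((c / total) ** 2 for c in read_counts)


def pielou_index(read_counts):
    """Pielou's evenness J = H / ln(S), with S the number of observed clones."""
    observed = sum(1 for c in read_counts if c > 0)
    if observed < 2:
        return 0.0  # evenness is not defined for fewer than two clones
    return shannon_index(read_counts) / math.log(observed)


# Hypothetical read counts per unique clone, not data from the study.
even_sample = [10, 10, 10, 10]
skewed_sample = [37, 1, 1, 1]
```

A perfectly even sample of four clones gives an inverse Simpson index of 4 and a Pielou's index of 1, while a sample dominated by one clone scores lower on all three indices.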

TABLE 20 Number of reads (mean ± SD)

| Number of reads | ME/CFS patients (37 cases) | MS patients (10 cases) | P value |
| --- | --- | --- | --- |
| Total reads | 289342.6 ± 112471.3 | 271898.7 ± 47747.6 | 0.47 |
| Assigned reads | 162337.5 ± 40656.5 | 148545.3 ± 35911 | 0.31 |
| In-frame reads | 158459 ± 39046.6 | 145064.7 ± 35436.6 | 0.32 |
| Unique reads | 16936.5 ± 9067.3 | 14962.8 ± 4643.3 | 0.35 |

2-2. Differentiation of ME/CFS Patients from MS Patients Using Multivariate Logistic Analysis

Next, the above 11 types of IGH genes detected to have a significant difference between ME/CFS patients and MS patients were studied as to whether the genes can differentiate ME/CFS patients from MS patients by using multivariate logistic analysis. Logistic regression analysis was performed utilizing usage frequencies of any two types of IGH genes or any three types of IGH genes among the 11 types of IGH genes to compute an AUC value. Combinations with a high AUC value were listed as combinations of IGH genes with excellent predictive performance (Tables 21 and 22).

TABLE 21 Differentiation using combinations of any two variables of IGH genes found to have a significant difference between ME/CFS groups and MS groups (AUC ≥ 0.8)

| IGH gene 1 | IGH gene 2 | Number of data | AUC value |
| --- | --- | --- | --- |
| IGHD5-18 | IGHV4-39 | 47 | 0.845946 |
| IGHD5-5 | IGHV4-39 | 47 | 0.843243 |
| IGHV3-73 | IGHV4-28 | 47 | 0.837838 |
| IGHD5-5 | IGHV3-23D | 47 | 0.835135 |
| IGHD5-18 | IGHV3-23D | 47 | 0.835135 |
| IGHD5-18 | IGHV3-7 | 47 | 0.824324 |
| IGHD5-5 | IGHV3-7 | 47 | 0.821622 |
| IGHV3-23D | IGHV3-7 | 47 | 0.816216 |
| IGHV3-23 | IGHV3-7 | 47 | 0.813514 |
| IGHV3-7 | IGHV4-39 | 47 | 0.813514 |
| IGHD4-17 | IGHV3-7 | 47 | 0.813514 |
| IGHV3-23 | IGHV4-28 | 47 | 0.810811 |
| IGHV3-23D | IGHV3-NL1 | 47 | 0.808108 |
| IGHV3-NL1 | IGHV4-28 | 47 | 0.808108 |
| IGHV3-7 | IGHV3-73 | 47 | 0.805405 |
| IGHV3-73 | IGHV3-NL1 | 47 | 0.802703 |
| IGHV3-7 | IGHV4-28 | 47 | 0.8 |

TABLE 22 Differentiation using combinations of any three variables of IGH genes found to have a significant difference between ME/CFS groups and MS groups (AUC ≥ 0.85)

| IGH gene 1 | IGH gene 2 | IGH gene 3 | Number of data | AUC value |
| --- | --- | --- | --- | --- |
| IGHD5-5 | IGHV3-7 | IGHV4-39 | 47 | 0.908108 |
| IGHD5-18 | IGHV3-7 | IGHV4-39 | 47 | 0.908108 |
| IGHD5-18 | IGHV3-23D | IGHV3-7 | 47 | 0.883784 |
| IGHD5-18 | IGHV3-23D | IGHV4-39 | 47 | 0.878378 |
| IGHD5-5 | IGHV3-23D | IGHV3-7 | 47 | 0.875676 |
| IGHD5-5 | IGHV3-23D | IGHV4-39 | 47 | 0.872973 |
| IGHD5-5 | IGHV3-7 | IGHV3-73 | 47 | 0.867568 |
| IGHD5-18 | IGHV3-7 | IGHV3-73 | 47 | 0.867568 |
| IGHD5-18 | IGHV3-73 | IGHV4-39 | 47 | 0.867568 |
| IGHD5-18 | IGHD5-5 | IGHV3-23D | 47 | 0.864865 |
| IGHD5-5 | IGHV3-33 | IGHV4-39 | 47 | 0.862162 |
| IGHD5-5 | IGHV3-73 | IGHV4-39 | 47 | 0.862162 |
| IGHD4-17 | IGHD5-5 | IGHV4-39 | 47 | 0.862162 |
| IGHD5-18 | IGHV3-33 | IGHV4-39 | 47 | 0.859459 |
| IGHD4-17 | IGHD5-18 | IGHV4-39 | 47 | 0.859459 |
| IGHD5-18 | IGHV3-NL1 | IGHV4-39 | 47 | 0.856757 |
| IGHD5-18 | IGHV4-28 | IGHV4-39 | 47 | 0.856757 |
| IGHD5-18 | IGHV3-23 | IGHV4-39 | 47 | 0.854054 |
| IGHD5-18 | IGHV3-23D | IGHV4-28 | 47 | 0.854054 |
| IGHD5-5 | IGHV3-NL1 | IGHV4-39 | 47 | 0.854054 |
| IGHD5-18 | IGHV3-23 | IGHV3-7 | 47 | 0.851351 |
| IGHD5-5 | IGHV4-28 | IGHV4-39 | 47 | 0.851351 |

Multivariate logistic analysis was performed to study whether ME/CFS patients can be differentiated from MS patients using any two or three variables for IGH genes that were demonstrated to be capable of differentiating healthy individuals from ME/CFS patients (Tables 23 to 30). As a result, a large number of combinations with a high AUC value, indicating high multivariate logistic model performance, was obtained, and these combinations were found to be capable of distinguishing ME/CFS patients from MS patients with high specificity and sensitivity. When any two variables of 35 types of IGH genes exhibiting a significant difference at p<0.2 between healthy individuals and ME/CFS patients were used, 48 combinations exhibited AUC ≥ 0.8, and when any three variables were used, 21 combinations exhibited AUC ≥ 0.9. When any two or three variables of 18 types of IGH genes exhibiting a significant difference at p<0.1 between healthy individuals and ME/CFS patients were used, 17 combinations (AUC ≥ 0.8) and 21 combinations (AUC ≥ 0.875) exhibited a high value, respectively. When any two or three variables of 8 types of IGH genes exhibiting a significant difference at p<0.05 between healthy individuals and ME/CFS patients were used, 7 combinations (AUC ≥ 0.8) and 22 combinations (AUC ≥ 0.8) exhibited a high value, respectively. When any two or three variables of 6 types of IGH genes tested in Example 3-1 were used, 1 combination (AUC ≥ 0.7) and 11 combinations (AUC ≥ 0.7) exhibited a high value, respectively. The above results revealed that ME/CFS patients and MS patients can be differentiated with high accuracy by using a single or a plurality of IGH gene usage frequencies.

TABLE 23 Differentiation of ME/CFS from MS with combination of any two variables from IGHs exhibiting a significant difference at p < 0.2 between healthy individuals and ME/CFS patients (AUC ≥ 0.8)

| IGH gene 1 | IGH gene 2 | Number of data | AUC value |
| --- | --- | --- | --- |
| IGHGP | IGHV4-31 | 47 | 0.886486 |
| IGHGP | IGHV3-49 | 47 | 0.87027 |
| IGHG1 | IGHGP | 47 | 0.864865 |
| IGHGP | IGHV3-30 | 47 | 0.864865 |
| IGHGP | IGHV1-69-2 | 47 | 0.862162 |
| IGHD4-OR15-4a/b | IGHGP | 47 | 0.862162 |
| IGHGP | IGHV3-23D | 47 | 0.859459 |
| IGHD4-17 | IGHGP | 47 | 0.859459 |
| IGHGP | IGHV5-10-1 | 47 | 0.859459 |
| IGHGP | IGHV3-64D | 47 | 0.854054 |
| IGHD4-17 | IGHV3-30 | 47 | 0.851351 |
| IGHGP | IGHV3-OR16-9 | 47 | 0.851351 |
| IGHGP | IGHV3-30-5 | 47 | 0.851351 |
| IGHD5-12 | IGHGP | 47 | 0.848649 |
| IGHD3-22 | IGHV3-23D | 47 | 0.845946 |
| IGHG4 | IGHGP | 47 | 0.843243 |
| IGHD1-7 | IGHGP | 47 | 0.843243 |
| IGHGP | IGHV3-30-3 | 47 | 0.843243 |
| IGHGP | IGHV3-33 | 47 | 0.840541 |
| IGHGP | IGHJ6 | 47 | 0.840541 |
| IGHGP | IGHV4-39 | 47 | 0.835135 |
| IGHD4-23 | IGHGP | 47 | 0.832432 |
| IGHGP | IGHV3-48 | 47 | 0.832432 |
| IGHGP | IGHV3-64 | 47 | 0.832432 |
| IGHGP | IGHV1-3 | 47 | 0.832432 |
| IGHGP | IGHV5-51 | 47 | 0.82973 |
| IGHGP | IGHV1-OR15-9 | 47 | 0.82973 |
| IGHGP | IGHV3-23 | 47 | 0.82973 |
| IGHD1-26 | IGHGP | 47 | 0.82973 |
| IGHD3-22 | IGHGP | 47 | 0.82973 |
| IGHV4-31 | IGHV4-39 | 47 | 0.827027 |
| IGHV3-30 | IGHV4-39 | 47 | 0.827027 |
| IGHGP | IGHV3-43D | 47 | 0.827027 |
| IGHV1-OR15-5 | IGHV3-23 | 47 | 0.827027 |
| IGHGP | IGHV1-OR15-5 | 47 | 0.827027 |
| IGHGP | IGHV3-21 | 47 | 0.827027 |
| IGHD6-13 | IGHGP | 47 | 0.827027 |
| IGHGP | IGHV3-73 | 47 | 0.824324 |
| IGHV4-39 | IGHV5-10-1 | 47 | 0.821622 |
| IGHV3-30-5 | IGHV3-73 | 47 | 0.818919 |
| IGHV3-49 | IGHV4-31 | 47 | 0.818919 |
| IGHD6-6 | IGHGP | 47 | 0.818919 |
| IGHV3-23 | IGHV5-10-1 | 47 | 0.808108 |
| IGHV4-31 | IGHV5-10-1 | 47 | 0.808108 |
| IGHV1-69-2 | IGHV4-39 | 47 | 0.802703 |
| IGHV3-23D | IGHV3-30 | 47 | 0.802703 |
| IGHV3-23D | IGHV4-31 | 47 | 0.8 |
| IGHD4-17 | IGHV3-30-3 | 47 | 0.8 |

TABLE 24 Differentiation of ME/CFS from MS with combination of any three variables from IGHs exhibiting a significant difference at p < 0.2 between healthy individuals and ME/CFS patients (AUC ≥ 0.9)

| IGH gene 1 | IGH gene 2 | IGH gene 3 | Number of data | AUC value |
| --- | --- | --- | --- | --- |
| IGHD4-17 | IGHGP | IGHV3-30 | 47 | 0.943243 |
| IGHGP | IGHV3-30 | IGHV3-30-5 | 47 | 0.937838 |
| IGHD4-17 | IGHV3-23D | IGHV3-30 | 47 | 0.927027 |
| IGHGP | IGHV4-31 | IGHV5-10-1 | 47 | 0.921622 |
| IGHGP | IGHV3-49 | IGHV4-31 | 47 | 0.918919 |
| IGHG1 | IGHGP | IGHV4-31 | 47 | 0.916216 |
| IGHGP | IGHV3-64D | IGHV4-31 | 47 | 0.910811 |
| IGHD3-22 | IGHV1-OR15-5 | IGHV3-23D | 47 | 0.910811 |
| IGHD4-OR15-4a/b | IGHGP | IGHV4-31 | 47 | 0.908108 |
| IGHGP | IGHV3-30 | IGHV4-31 | 47 | 0.908108 |
| IGHD4-17 | IGHV3-30 | IGHV3-73 | 47 | 0.905405 |
| IGHD4-OR15-4a/b | IGHGP | IGHV3-49 | 47 | 0.905405 |
| IGHGP | IGHV1-69-2 | IGHV4-31 | 47 | 0.902703 |
| IGHD4-17 | IGHGP | IGHV4-31 | 47 | 0.902703 |
| IGHGP | IGHV1-OR15-5 | IGHV4-31 | 47 | 0.902703 |
| IGHD1-7 | IGHGP | IGHV4-31 | 47 | 0.902703 |
| IGHGP | IGHV3-33 | IGHV4-31 | 47 | 0.902703 |
| IGHD3-22 | IGHV3-23D | IGHV3-33 | 47 | 0.902703 |
| IGHGP | IGHV3-23D | IGHV3-30 | 47 | 0.902703 |
| IGHV3-49 | IGHV4-31 | IGHV5-10-1 | 47 | 0.9 |
| IGHG1 | IGHGP | IGHV3-23D | 47 | 0.9 |

TABLE 25 Differentiation of ME/CFS from MS with combination of any two variables from IGHs exhibiting a significant difference at p < 0.1 between healthy individuals and ME/CFS patients (AUC ≥ 0.8)

| IGH gene 1 | IGH gene 2 | Number of data | AUC value |
| --- | --- | --- | --- |
| IGHGP | IGHV3-49 | 47 | 0.87027 |
| IGHG1 | IGHGP | 47 | 0.864865 |
| IGHGP | IGHV3-30 | 47 | 0.864865 |
| IGHGP | IGHV3-64D | 47 | 0.854054 |
| IGHGP | IGHV3-30-5 | 47 | 0.851351 |
| IGHGP | IGHV3-30-3 | 47 | 0.843243 |
| IGHGP | IGHV3-33 | 47 | 0.840541 |
| IGHGP | IGHJ6 | 47 | 0.840541 |
| IGHD4-23 | IGHGP | 47 | 0.832432 |
| IGHGP | IGHV3-48 | 47 | 0.832432 |
| IGHGP | IGHV3-64 | 47 | 0.832432 |
| IGHGP | IGHV1-3 | 47 | 0.832432 |
| IGHGP | IGHV3-23 | 47 | 0.82973 |
| IGHD1-26 | IGHGP | 47 | 0.82973 |
| IGHD3-22 | IGHGP | 47 | 0.82973 |
| IGHD6-13 | IGHGP | 47 | 0.827027 |
| IGHD6-6 | IGHGP | 47 | 0.818919 |

TABLE 26 Differentiation of ME/CFS from MS with combination of any three variables from IGHs exhibiting a significant difference at p < 0.1 between healthy individuals and ME/CFS patients (AUC ≥ 0.85)

| IGH gene 1 | IGH gene 2 | IGH gene 3 | Number of data | AUC value |
| --- | --- | --- | --- | --- |
| IGHGP | IGHV3-30 | IGHV3-30-5 | 47 | 0.937838 |
| IGHD6-6 | IGHG1 | IGHGP | 47 | 0.894595 |
| IGHG1 | IGHGP | IGHV3-30-5 | 47 | 0.891892 |
| IGHG1 | IGHGP | IGHV3-23 | 47 | 0.891892 |
| IGHG1 | IGHGP | IGHV3-30 | 47 | 0.889189 |
| IGHGP | IGHV3-30-5 | IGHV3-64D | 47 | 0.886486 |
| IGHGP | IGHV3-30 | IGHV3-49 | 47 | 0.886486 |
| IGHGP | IGHV3-30 | IGHV3-33 | 47 | 0.883784 |
| IGHGP | IGHV3-30-3 | IGHV3-30-5 | 47 | 0.883784 |
| IGHGP | IGHV3-30 | IGHV3-64D | 47 | 0.881081 |
| IGHG1 | IGHGP | IGHV3-48 | 47 | 0.881081 |
| IGHD6-6 | IGHGP | IGHV3-49 | 47 | 0.878378 |
| IGHG1 | IGHGP | IGHV3-33 | 47 | 0.878378 |
| IGHD3-22 | IGHGP | IGHV3-33 | 47 | 0.878378 |
| IGHD4-23 | IGHG1 | IGHGP | 47 | 0.878378 |
| IGHGP | IGHV3-23 | IGHV3-49 | 47 | 0.878378 |
| IGHG1 | IGHGP | IGHV3-49 | 47 | 0.878378 |
| IGHG1 | IGHGP | IGHV3-64D | 47 | 0.875676 |
| IGHGP | IGHV3-49 | IGHV3-64D | 47 | 0.875676 |
| IGHGP | IGHV3-30 | IGHV3-48 | 47 | 0.875676 |
| IGHGP | IGHV1-3 | IGHV3-49 | 47 | 0.875676 |

TABLE 27 Differentiation of ME/CFS from MS with combination of any two variables from IGHs exhibiting a significant difference at p < 0.05 between healthy individuals and ME/CFS patients (AUC ≥ 0.8)

| IGH gene 1 | IGH gene 2 | Number of data | AUC value |
| --- | --- | --- | --- |
| IGHGP | IGHV3-49 | 47 | 0.87027 |
| IGHGP | IGHV3-30 | 47 | 0.864865 |
| IGHGP | IGHV3-30-3 | 47 | 0.843243 |
| IGHGP | IGHJ6 | 47 | 0.840541 |
| IGHGP | IGHV1-3 | 47 | 0.832432 |
| IGHD1-26 | IGHGP | 47 | 0.82973 |
| IGHD3-22 | IGHGP | 47 | 0.82973 |

TABLE 28 Differentiation of ME/CFS from MS with combination of any three variables from IGHs exhibiting a significant difference at p < 0.05 between healthy individuals and ME/CFS patients (AUC ≥ 0.8)

| IGH gene 1 | IGH gene 2 | IGH gene 3 | Number of data | AUC value |
| --- | --- | --- | --- | --- |
| IGHGP | IGHV3-30 | IGHV3-49 | 47 | 0.886486 |
| IGHGP | IGHV1-3 | IGHV3-49 | 47 | 0.875676 |
| IGHD1-26 | IGHGP | IGHV3-49 | 47 | 0.87027 |
| IGHGP | IGHJ6 | IGHV3-49 | 47 | 0.867568 |
| IGHGP | IGHV3-30 | IGHV3-30-3 | 47 | 0.867568 |
| IGHGP | IGHV3-30-3 | IGHV3-49 | 47 | 0.864865 |
| IGHGP | IGHV1-3 | IGHV3-30 | 47 | 0.864865 |
| IGHD3-22 | IGHGP | IGHV3-49 | 47 | 0.862162 |
| IGHD1-26 | IGHGP | IGHV3-30 | 47 | 0.862162 |
| IGHGP | IGHJ6 | IGHV3-30 | 47 | 0.859459 |
| IGHD3-22 | IGHGP | IGHV3-30 | 47 | 0.851351 |
| IGHD1-26 | IGHGP | IGHJ6 | 47 | 0.840541 |
| IGHGP | IGHJ6 | IGHV1-3 | 47 | 0.840541 |
| IGHGP | IGHV1-3 | IGHV3-30-3 | 47 | 0.837838 |
| IGHD1-26 | IGHGP | IGHV3-30-3 | 47 | 0.832432 |
| IGHD3-22 | IGHGP | IGHV3-30-3 | 47 | 0.832432 |
| IGHD3-22 | IGHGP | IGHJ6 | 47 | 0.832432 |
| IGHD1-26 | IGHGP | IGHV1-3 | 47 | 0.82973 |
| IGHD1-26 | IGHD3-22 | IGHGP | 47 | 0.82973 |
| IGHD3-22 | IGHGP | IGHV1-3 | 47 | 0.82973 |
| IGHD1-26 | IGHD3-22 | IGHV3-49 | 47 | 0.818919 |
| IGHGP | IGHJ6 | IGHV3-30-3 | 47 | 0.818919 |

TABLE 29 Differentiation of ME/CFS from MS with combination of any two variables from 6 types of IGHs tested in Example 3-1 (AUC ≥ 0.7)

| IGH gene 1 | IGH gene 2 | Number of data | AUC value |
| --- | --- | --- | --- |
| IGHV3-30 | IGHV3-49 | 47 | 0.735135 |

TABLE 30 Differentiation of ME/CFS from MS with combination of any three variables from 6 types of IGHs tested in Example 3-1 (AUC ≥ 0.7)

| IGH gene 1 | IGH gene 2 | IGH gene 3 | Number of data | AUC value |
| --- | --- | --- | --- | --- |
| IGHD1-26 | IGHV3-30 | IGHV3-49 | 47 | 0.764865 |
| IGHV1-3 | IGHV3-30 | IGHV3-49 | 47 | 0.737838 |
| IGHD1-26 | IGHV1-3 | IGHV3-49 | 47 | 0.737838 |
| IGHV3-30 | IGHV3-30-3 | IGHV3-49 | 47 | 0.737838 |
| IGHD1-26 | IGHJ6 | IGHV3-49 | 47 | 0.737838 |
| IGHJ6 | IGHV3-30 | IGHV3-49 | 47 | 0.735135 |
| IGHD1-26 | IGHV3-30-3 | IGHV3-49 | 47 | 0.776216 |
| IGHJ6 | IGHV3-30-3 | IGHV3-49 | 47 | 0.710811 |
| IGHD1-26 | IGHJ6 | IGHV3-30 | 47 | 0.710811 |
| IGHJ6 | IGHV1-3 | IGHV3-49 | 47 | 0.705405 |
| IGHD1-26 | IGHJ6 | IGHV3-30-3 | 47 | 0.705405 |

2-3. Differentiation of ME/CFS Patients from MS Patients by Multivariate Logistic Analysis Using any IGH Gene

Polynomial logistic analysis was performed using combinations of any two types of IGH from 74 types of IGHV, 32 types of IGHD, 6 types of IGHJ, and 5 types of IGHC for a total of 117 types of IGH genes (total of 6786 sets of combinations), and those with a high AUC value indicating high predictive performance were selected (Table 31). There were 4, 252, and 1879 sets of combinations exhibiting AUC 0.9, AUC 0.8, and AUC 0.7, respectively. Two variables IGHD3-3 and IGHGP showed the highest AUC of 0.92. Next, polynomial logistic analysis was performed using combinations of any three types of IGH from 117 types of IGH genes (total of 260130 sets of combinations), and those with a high AUC value indicating high predictive performance were selected (Table 32).
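The combination counts quoted in this section follow directly from the number of gene types. A short check using Python's standard library (the variable names are illustrative only):

```python
from math import comb

# Numbers of gene types per IGH segment, as stated in the text.
gene_types = {"IGHV": 74, "IGHD": 32, "IGHJ": 6, "IGHC": 5}

total_genes = sum(gene_types.values())   # all IGH gene types
pair_count = comb(total_genes, 2)        # unordered sets of two genes
triple_count = comb(total_genes, 3)      # unordered sets of three genes

# The 11 genes with a significant ME/CFS vs. MS difference (section 2-2).
pairs_of_11 = comb(11, 2)
triples_of_11 = comb(11, 3)

print(total_genes, pair_count, triple_count)  # 117 6786 260130
print(pairs_of_11, triples_of_11)             # 55 165
```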

TABLE 31 Differentiation of ME/CFS from MS with combination of any two IGH genes (top 50)

| Rank | IGH gene 1 | IGH gene 2 | Number of data | AUC value |
| --- | --- | --- | --- | --- |
| 1 | IGHD3-3 | IGHGP | 47 | 0.921622 |
| 2 | IGHGP | IGHV4-34 | 47 | 0.913514 |
| 3 | IGHD2-21 | IGHGP | 47 | 0.910811 |
| 4 | IGHGP | IGHV3-72 | 47 | 0.902703 |
| 5 | IGHGP | IGHV4-38-2 | 47 | 0.9 |
| 6 | IGHD1-1 | IGHGP | 47 | 0.894595 |
| 7 | IGHD3-3 | IGHV3-7 | 47 | 0.881892 |
| 8 | IGHD3-OR15-3a/b | IGHGP | 47 | 0.889189 |
| 9 | IGHGP | IGHV4-31 | 47 | 0.886486 |
| 10 | IGHGP | IGHV3-7 | 47 | 0.883784 |
| 11 | IGHV3-7 | IGHV4-38-2 | 47 | 0.883784 |
| 12 | IGHD2-2 | IGHGP | 47 | 0.881081 |
| 13 | IGHD5-5 | IGHGP | 47 | 0.878378 |
| 13 | IGHD5-18 | IGHGP | 47 | 0.878378 |
| 15 | IGHV3-7 | IGHV3-72 | 47 | 0.875676 |
| 15 | IGHD3-22 | IGHV3-7 | 47 | 0.875676 |
| 17 | IGHGP | IGHV4-30-2 | 47 | 0.872973 |
| 18 | IGHD3-3 | IGHV4-39 | 47 | 0.87027 |
| 19 | IGHGP | IGHV3-49 | 47 | 0.87027 |
| 19 | IGHGP | IGHV3/OR16-6 | 47 | 0.87027 |
| 19 | IGHGP | IGHV4-28 | 47 | 0.87027 |
| 19 | IGHGP | IGHJ1 | 47 | 0.87027 |
| 19 | IGHV1-46 | IGHV3-7 | 47 | 0.87027 |
| 24 | IGHGP | IGHV3-43 | 47 | 0.867568 |
| 24 | IGHGP | IGHV4-61 | 47 | 0.867568 |
| 26 | IGHG1 | IGHGP | 47 | 0.864865 |
| 31 | IGHGP | IGHV3/OR16-10 | 47 | 0.859459 |
| 31 | IGHGP | IGHV5-10-1 | 47 | 0.859459 |
| 31 | IGHD1-14 | IGHGP | 47 | 0.859459 |
| 31 | IGHD4-17 | IGHGP | 47 | 0.859459 |
| 31 | IGHGP | IGHJ3 | 47 | 0.859459 |
| 31 | IGHV1-69-2 | IGHV3-7 | 47 | 0.859459 |
| 38 | IGHGP | IGHV3/OR15-7 | 47 | 0.856757 |
| 38 | IGHD1/OR15-1a/b | IGHGP | 47 | 0.856757 |
| 40 | IGHGP | IGHV2-26 | 47 | 0.854054 |
| 40 | IGHGP | IGHV3-64D | 47 | 0.854054 |
| 40 | IGHJ4 | IGHV3-7 | 47 | 0.854054 |
| 43 | IGHGP | IGHV3-30-5 | 47 | 0.851351 |
| 44 | IGHG3 | IGHGP | 47 | 0.851351 |
| 44 | IGHGP | IGHV3/OR16-9 | 47 | 0.851351 |
| 44 | IGHGP | IGHV4/OR15-8 | 47 | 0.851351 |
| 44 | IGHD4-17 | IGHV3-30 | 47 | 0.851351 |
| 44 | IGHJ1 | IGHV4-39 | 47 | 0.851351 |
| 49 | IGHGP | IGHV1-45 | 47 | 0.848649 |
| 49 | IGHGP | IGHV3/OR16-12 | 47 | 0.848649 |
| 49 | IGHGP | IGHV4-59 | 47 | 0.848649 |
| 49 | IGHD5-12 | IGHGP | 47 | 0.848649 |
| 49 | IGHGP | IGHJ4 | 47 | 0.848649 |

TABLE 32 Differentiation of ME/CFS from MS with combination of any three IGH genes (top 50)

| Rank | IGH gene 1 | IGH gene 2 | IGH gene 3 | Number of data | AUC value |
| --- | --- | --- | --- | --- | --- |
| 1 | IGHD1.14 | IGHD3.3 | IGHGP | 47 | 0.989189 |
| 2 | IGHD3.3 | IGHGP | IGHV3.38.3 | 47 | 0.983784 |
| 3 | IGHGP | IGHV3.7 | IGHV3.72 | 47 | 0.978378 |
| 4 | IGHD5.5 | IGHGP | IGHV4.34 | 47 | 0.975676 |
| 4 | IGHD5.18 | IGHGP | IGHV4.34 | 47 | 0.975676 |
| 6 | IGHD1.1 | IGHGP | IGHV4.34 | 47 | 0.972973 |
| 7 | IGHGP | IGHV3.64D | IGHV4.34 | 47 | 0.967568 |
| 8 | IGHD3.3 | IGHGP | IGHV3.72 | 47 | 0.964865 |
| 8 | IGHD2.21 | IGHD3.3 | IGHGP | 47 | 0.964865 |
| 10 | IGHD2.21 | IGHGP | IGHV4.31 | 47 | 0.962162 |
| 10 | IGHD2.21 | IGHGP | IGHV4.34 | 47 | 0.962162 |
| 12 | IGHGP | IGHV3.30 | IGHV4.34 | 47 | 0.959459 |
| 12 | IGHD3.3 | IGHGP | IGHV3.30 | 47 | 0.959459 |
| 12 | IGHGP | IGHV4.34 | IGHV5.10.1 | 47 | 0.959459 |
| 12 | IGHGP | IGHJ1 | IGHV4.34 | 47 | 0.959459 |
| 12 | IGHV3.64 | IGHV3.7 | IGHV3.72 | 47 | 0.959459 |
| 12 | IGHD3.3 | IGHV3.30 | IGHV4.39 | 47 | 0.959459 |
| 18 | IGHD3.3 | IGHGP | IGHV3.49 | 47 | 0.956757 |
| 19 | IGHD3.3 | IGHGP | IGHV3.7 | 47 | 0.954054 |
| 19 | IGHGP | IGHV3.64 | IGHV4.34 | 47 | 0.954054 |
| 19 | IGHGP | IGHV3.OR15.7 | IGHV4.34 | 47 | 0.954054 |
| 19 | IGHGP | IGHJ1 | IGHV4.28 | 47 | 0.954054 |
| 19 | IGHD3.OR15.3a | IGHGP | IGHV4.34 | 47 | 0.954054 |
| 19 | IGHD2.2 | IGHD2.21 | IGHGP | 47 | 0.954054 |
| 25 | IGHD3.3 | IGHGP | IGHV2.5 | 47 | 0.951351 |
| 25 | IGHGP | IGHV3.49 | IGHV4.34 | 47 | 0.951351 |
| 25 | IGHD3.3 | IGHV3.64 | IGHV3.7 | 47 | 0.951351 |
| 25 | IGHD3.22 | IGHV3.7 | IGHV3.72 | 47 | 0.951351 |
| 25 | IGHD3.22 | IGHV3.7 | IGHV4.38.2 | 47 | 0.951351 |
| 30 | IGHGP | IGHV3.43 | IGHV4.34 | 47 | 0.948649 |
| 30 | IGHD2.21 | IGHGP | IGHV3.43 | 47 | 0.948649 |
| 30 | IGHD2.21 | IGHGP | IGHV4.38.2 | 47 | 0.948649 |
| 33 | IGHD3.3 | IGHG1 | IGHGP | 47 | 0.948649 |
| 33 | IGHGP | IGHV1.46 | IGHV4.34 | 47 | 0.948649 |
| 33 | IGHD2.21 | IGHGP | IGHV3.23D | 47 | 0.948649 |
| 33 | IGHD1.1 | IGHGP | IGHV3.72 | 47 | 0.948649 |
| 33 | IGHGP | IGHV4.34 | IGHV4.38.2 | 47 | 0.948649 |
| 38 | IGHG1 | IGHGP | IGHV4.34 | 47 | 0.945946 |
| 38 | IGHD2.21 | IGHGP | IGHV3.7 | 47 | 0.945946 |
| 38 | IGHGP | IGHV3.72 | IGHV4.34 | 47 | 0.945946 |
| 38 | IGHD3.3 | IGHGP | IGHV3.OR16.13 | 47 | 0.945946 |
| 38 | IGHD3.3 | IGHGP | IGHV4.34 | 47 | 0.945946 |
| 38 | IGHGP | IGHJ3 | IGHV4.34 | 47 | 0.945946 |
| 38 | IGHD3.3 | IGHGP | IGHV4.38.2 | 47 | 0.945946 |
| 38 | IGHD3.3 | IGHV2.5 | IGHV4.39 | 47 | 0.945946 |
| 46 | IGHD2.21 | IGHG1 | IGHGP | 47 | 0.943243 |
| 46 | IGHD3.3 | IGHGP | IGHV3.25 | 47 | 0.943243 |
| 46 | IGHD4.17 | IGHGP | IGHV3.30 | 47 | 0.943243 |
| 46 | IGHGP | IGHV3.53 | IGHV4.34 | 47 | 0.943243 |
| 46 | IGHD3.3 | IGHGP | IGHV3.64D | 47 | 0.943243 |
| 46 | IGHD2.21 | IGHGP | IGHV3.72 | 47 | 0.943243 |
| 46 | IGHGP | IGHV3.OR15.7 | IGHV4.38.2 | 47 | 0.943243 |
| 46 | IGHGP | IGHV3.OR16.6 | IGHV4.34 | 47 | 0.943243 |
| 46 | IGHGP | IGHV3.OR16.6 | IGHV4.38.2 | 47 | 0.943243 |
| 46 | IGHD3.3 | IGHGP | IGHV4.61 | 47 | 0.943243 |
| 46 | IGHD1.1 | IGHD2.2 | IGHGP | 47 | 0.943243 |
| 46 | IGHD1.1 | IGHD3.3 | IGHGP | 47 | 0.943243 |
| 46 | IGHD2.21 | IGHGP | IGHJ3 | 47 | 0.943243 |
| 46 | IGHV3.7 | IGHV3.OR15.7 | IGHV4.38.2 | 47 | 0.943243 |
| 46 | IGHD2.8 | IGHD3.3 | IGHV3.7 | 47 | 0.943243 |
| 46 | IGHD3.3 | IGHJ1 | IGHV4.39 | 47 | 0.943243 |

2-4. Comparison of ME/CFS Patients with Non-ME/CFS Patients

In order to study IGH genes that differentiate ME/CFS patients from non-ME/CFS patients, 23 healthy individuals were added to 10 MS patients as the non-ME/CFS group, and the usage frequencies of IGH genes were compared between said group and a group of 37 ME/CFS patients. The Mann-Whitney test between the two groups showed that the usage frequencies of 9 genes, i.e., IGHV3-49 (P=0.0013), IGHV3-30-3 (P=0.0022), IGHD1-26 (P=0.0053), IGHV4-34 (P=0.0118), IGHV3-30 (P=0.0186), IGHV4-31 (P=0.0205), IGHV3-64 (P=0.0286), IGHJ6 (P=0.0304), and IGHV5-10-1 (P=0.0373), were significantly higher in ME/CFS patients compared to non-ME/CFS patients. Meanwhile, the usage frequencies of IGHGP (P=0.0061), IGHD3-22 (P=0.0313), IGHV3-33 (P=0.0332), and IGHV3-73 (P=0.0332) were significantly higher in the non-ME/CFS group (Table 33, FIG. 11).

TABLE 33 IGH genes detected to have a significant difference between ME/CFS group and non-ME/CFS group

| IGH gene | Mean value (Non-ME/CFS) | Mean value (ME/CFS) | Standard deviation (Non-ME/CFS) | Standard deviation (ME/CFS) | P value* |
| --- | --- | --- | --- | --- | --- |
| IGHV3-49 | 1.11 | 2.63 | 0.99 | 2.63 | 0.001 |
| IGHV3-30-3 | 0.81 | 1.44 | 0.75 | 1.05 | 0.002 |
| IGHD1-26 | 2.88 | 6.31 | 3.23 | 5.84 | 0.005 |
| IGHGP | 0.13 | 0.03 | 0.29 | 0.03 | 0.006 |
| IGHV4-34 | 3.06 | 3.97 | 3.37 | 2.45 | 0.011 |
| IGHV3-30 | 0.96 | 1.43 | 0.67 | 0.87 | 0.018 |
| IGHV4-31 | 0.56 | 1.06 | 0.67 | 1.08 | 0.020 |
| IGHV3-64 | 0.67 | 1.04 | 0.70 | 0.99 | 0.028 |
| IGHJ6 | 16.00 | 20.06 | 6.10 | 7.87 | 0.030 |
| IGHD3-22 | 14.36 | 8.91 | 11.61 | 7.21 | 0.031 |
| IGHV3-33 | 3.08 | 2.21 | 1.82 | 1.09 | 0.033 |
| IGHV3-73 | 1.82 | 0.81 | 2.69 | 1.45 | 0.033 |
| IGHV5-10-1 | 0.34 | 1.04 | 0.91 | 1.85 | 0.037 |

*Mann-Whitney test

2-5. Differentiation of ME/CFS Patients from Non-ME/CFS Patients by Multivariate Logistic Analysis Using any IGH Gene

Polynomial logistic regression analysis was performed using any two IGHs from 74 types of IGHV, 32 types of IGHD, 6 types of IGHJ, and 5 types of IGHC for a total of 117 types of IGH genes (total of 6786 combinations), and those with a high AUC value indicating high predictive performance in differentiating ME/CFS patients from non-ME/CFS patients were selected (Table 34). IGHs exhibiting a high AUC value when using three variables (total of 260130 sets of combinations) were also selected in the same manner (Table 35). With any two variables, there were 5 and 469 sets of combinations exhibiting AUC ≥ 0.8 and AUC ≥ 0.7, respectively. With any three variables, there were 37 and 1164 sets of combinations exhibiting AUC ≥ 0.85 and AUC ≥ 0.8, respectively. With two variables, the combination of IGHGP and IGHV3-30-3 showed the highest AUC of 0.820. With any three variables, the combination of IGHGP, IGHV3-30, and IGHV3-49 showed the highest AUC of 0.889.

TABLE 34 Differentiation of ME/CFS from non-ME/CFS with combination of any two variables (top 50)

| Rank | IGH gene 1 | IGH gene 2 | Number of data | AUC value |
| --- | --- | --- | --- | --- |
| 1 | IGHGP | IGHV3-30-3 | 70 | 0.81982 |
| 1 | IGHD3-22 | IGHV3-49 | 70 | 0.81982 |
| 3 | IGHGP | IGHV3-30 | 70 | 0.812449 |
| 4 | IGHGP | IGHV3-49 | 70 | 0.804259 |
| 5 | IGHGP | IGHV4-34 | 70 | 0.801802 |
| 6 | IGHV3-49 | IGHV4-31 | 70 | 0.799345 |
| 7 | IGHV3-49 | IGHV3-64D | 70 | 0.796888 |
| 8 | IGHV3-30-3 | IGHV3-49 | 70 | 0.796069 |
| 9 | IGHD1-26 | IGHD6-6 | 70 | 0.790336 |
| 10 | IGHV1-3 | IGHV3-49 | 70 | 0.788698 |
| 10 | IGHV3-30 | IGHV3-49 | 70 | 0.788698 |
| 12 | IGHD6-6 | IGHGP | 70 | 0.78706 |
| 13 | IGHD4-17 | IGHV3-30-3 | 70 | 0.786241 |
| 14 | IGHD1-26 | IGHV3-49 | 70 | 0.782146 |
| 14 | IGHD4-23 | IGHV3-49 | 70 | 0.782146 |
| 16 | IGHD3-3 | IGHV3-49 | 70 | 0.781327 |
| 17 | IGHV1-69-2 | IGHV3-49 | 70 | 0.779689 |
| 18 | IGHV3-49 | IGHV3-73 | 70 | 0.777232 |
| 19 | IGHG4 | IGHV3-49 | 70 | 0.775594 |
| 19 | IGHV1/OR15-9 | IGHV3-30-3 | 70 | 0.775594 |
| 21 | IGHV3-30-3 | IGHV3-64D | 70 | 0.774775 |
| 22 | IGHV3-49 | IGHV3-7 | 70 | 0.773956 |
| 22 | IGHJ6 | IGHV3-30-3 | 70 | 0.773956 |
| 22 | IGHV3-33 | IGHV3-49 | 70 | 0.773956 |
| 25 | IGHD3-22 | IGHV3-30-3 | 70 | 0.773137 |
| 26 | IGHV3-49 | IGHV3-64 | 70 | 0.773137 |
| 26 | IGHV3-49 | IGHV4-30-4 | 70 | 0.773137 |
| 28 | IGHGP | IGHV4-31 | 70 | 0.771499 |
| 29 | IGHD1-26 | IGHV3-73 | 70 | 0.769861 |
| 30 | IGHV3-30-3 | IGHV4-28 | 70 | 0.768223 |
| 31 | IGHD6-6 | IGHV3-49 | 70 | 0.765766 |
| 32 | IGHV1/OR21-1 | IGHV3-49 | 70 | 0.764947 |
| 33 | IGHGP | IGHV3-23 | 70 | 0.754128 |
| 34 | IGHV1-69-2 | IGHV3-30-3 | 70 | 0.763309 |
| 34 | IGHV3-49 | IGHV4-39 | 70 | 0.763309 |
| 36 | IGHV1/OR15-5 | IGHV3-49 | 70 | 0.76249 |
| 37 | IGHG4 | IGHV3-30-3 | 70 | 0.761671 |
| 38 | IGHV3-30-3 | IGHV3-43D | 70 | 0.761671 |
| 39 | IGHV3-49 | IGHV5-10-1 | 70 | 0.760852 |
| 40 | IGHV3-30-3 | IGHV4-31 | 70 | 0.760033 |
| 40 | IGHD3-3 | IGHV3-30-3 | 70 | 0.760033 |
| 42 | IGHD1-20 | IGHGP | 70 | 0.759214 |
| 43 | IGHV3-49 | IGHV3/OR16-12 | 70 | 0.758395 |
| 43 | IGHV3-49 | IGHV4-30-2 | 70 | 0.758395 |
| 45 | IGHD6-6 | IGHV3-30-3 | 70 | 0.757576 |
| 46 | IGHV1/OR15-9 | IGHV3-49 | 70 | 0.756757 |
| 46 | IGHV3-30-5 | IGHV3-49 | 70 | 0.756757 |
| 48 | IGHV3-49 | IGHV3/OR16-13 | 70 | 0.755938 |
| 49 | IGHV3-20 | IGHV3-49 | 70 | 0.755119 |
| 49 | IGHJ4 | IGHV3-30-3 | 70 | 0.755119 |
| 49 | IGHD4/OR15-4a/b | IGHV3-49 | 70 | 0.755119 |

TABLE 35 Differentiation of ME/CFS from non-ME/CFS with combination of any three variables (top 50)

| Rank | IGH gene 1 | IGH gene 2 | IGH gene 3 | Number of data | AUC value |
| --- | --- | --- | --- | --- | --- |
| 1 | IGHGP | IGHV3-30 | IGHV3-49 | 70 | 0.8886 |
| 2 | IGHGP | IGHV3-30 | IGHV4-34 | 70 | 0.8878 |
| 2 | IGHGP | IGHV3-30-3 | IGHV3-49 | 70 | 0.8878 |
| 4 | IGHGP | IGHV3-30-3 | IGHV4-28 | 70 | 0.8763 |
| 4 | IGHD6-6 | IGHGP | IGHV3-30-3 | 70 | 0.8763 |
| 6 | IGHGP | IGHV3-30 | IGHV3-72 | 70 | 0.8731 |
| 6 | IGHD1-20 | IGHGP | IGHV3-49 | 70 | 0.8731 |
| 8 | IGHV3-49 | IGHV3-64D | IGHV3-7 | 70 | 0.8714 |
| 9 | IGHD4-17 | IGHGP | IGHV3-30 | 70 | 0.8690 |
| 10 | IGHD6-6 | IGHGP | IGHV3-30 | 70 | 0.8673 |
| 11 | IGHD3-22 | IGHGP | IGHV3-49 | 70 | 0.8657 |
| 12 | IGHD3-3 | IGHGP | IGHV3-30 | 70 | 0.8649 |
| 12 | IGHV3-49 | IGHV3-64D | IGHV4-31 | 70 | 0.8649 |
| 12 | IGHD3-22 | IGHV3-49 | IGHV3-73 | 70 | 0.8649 |
| 15 | IGHGP | IGHV3-30-3 | IGHV4-34 | 70 | 0.8624 |
| 16 | IGHD3-22 | IGHV3-49 | IGHV4-31 | 70 | 0.8616 |
| 17 | IGHGP | IGHV3-30-3 | IGHV3-72 | 70 | 0.8608 |
| 17 | IGHD3-22 | IGHV1-69-2 | IGHV3-49 | 70 | 0.8608 |
| 19 | IGHGP | IGHV3-30 | IGHV4-28 | 70 | 0.8591 |
| 20 | IGHD3-22 | IGHV3-49 | IGHV3-64D | 70 | 0.8583 |
| 21 | IGHD1-20 | IGHGP | IGHV3-30-3 | 70 | 0.8575 |
| 21 | IGHD3-3 | IGHGP | IGHV3-30-3 | 70 | 0.8575 |
| 21 | IGHGP | IGHV3-49 | IGHV4-34 | 70 | 0.8575 |
| 24 | IGHD2-21 | IGHGP | IGHV3-30-3 | 70 | 0.8559 |
| 25 | IGHGP | IGHJ6 | IGHV3-30-3 | 70 | 0.8550 |
| 25 | IGHGP | IGHV3-64D | IGHV4-34 | 70 | 0.8550 |
| 27 | IGHGP | IGHV3-30-3 | IGHV3/OR16-13 | 70 | 0.8542 |
| 27 | IGHGP | IGHV3-49 | IGHV4-31 | 70 | 0.8542 |
| 27 | IGHD3-22 | IGHD4/OR15-4a/b | IGHV3-49 | 70 | 0.8542 |
| 30 | IGHGP | IGHV3-49 | IGHV3-73 | 70 | 0.8534 |
| 31 | IGHD6-6 | IGHGP | IGHV3-49 | 70 | 0.8518 |
| 31 | IGHV1-69-2 | IGHV3-49 | IGHV4-31 | 70 | 0.8518 |
| 33 | IGHGP | IGHV3-30 | IGHV3/OR16-8 | 70 | 0.8509 |
| 33 | IGHD4-17 | IGHGP | IGHV3-30-3 | 70 | 0.8509 |
| 33 | IGHGP | IGHV3/OR16-13 | IGHV4-34 | 70 | 0.8509 |
| 36 | IGHGP | IGHV3-30 | IGHV3-33 | 70 | 0.8501 |
| 36 | IGHD3-22 | IGHV3-49 | IGHV3-64 | 70 | 0.8501 |
| 38 | IGHGP | IGHV3-30-3 | IGHV3/OR16-8 | 70 | 0.8493 |
| 39 | IGHGP | IGHV1/OR15-9 | IGHV3-30-3 | 70 | 0.8493 |
| 40 | IGHGP | IGHV3-30-3 | IGHV3-64D | 70 | 0.8485 |
| 41 | IGHGP | IGHV3-30 | IGHV7-81 | 70 | 0.8477 |
| 41 | IGHD3-22 | IGHV1-3 | IGHV3-49 | 70 | 0.8477 |
| 43 | IGHGP | IGHV4-34 | IGHV5-10-1 | 70 | 0.8468 |
| 43 | IGHD1-26 | IGHGP | IGHV4-34 | 70 | 0.8468 |
| 43 | IGHD1-26 | IGHD6-6 | IGHGP | 70 | 0.8468 |
| 43 | IGHD3-22 | IGHV3-38 | IGHV3-49 | 70 | 0.8468 |
| 47 | IGHGP | IGHV1-3 | IGHV3-49 | 70 | 0.8460 |
| 47 | IGHV3-30-3 | IGHV3-33 | IGHV3-49 | 70 | 0.8460 |
| 47 | IGHD3-22 | IGHV3-49 | IGHV4-28 | 70 | 0.8460 |
| 50 | IGHV3-30 | IGHV3-33 | IGHV3-49 | 70 | 0.8460 |

3. Prediction Differentiation Model for ME/CFS Differential Diagnosis

A prediction model formula for differentiating ME/CFS patients was created using usage frequency data for IGH genes obtained from BCR repertoire analysis. A differential model according to a logistic regression formula was created using a combination of IGH genes with a high AUC value in polynomial logistic analysis or significance test between two groups. An example of differential diagnosis is shown below.

For example, a combination of variables can be identified as follows.

(1) Healthy individuals and ME/CFS patients are compared, and a variable detected to have a significant difference (e.g., usage frequency of a gene in an IgG H chain variable region of a BCR) is selected; or

(2) Univariate logistic analysis is performed with one variable (e.g., one gene in an IgG H chain variable region of a BCR) as an independent variable, or multivariate logistic analysis is performed with two or more variables (e.g., two or more genes) as independent variables to obtain a logistic regression model formula. ROC analysis that measures the degree of fit of the regression model is performed to select a variable exhibiting a higher AUC value.

When a combination of variables is provided, univariate or multivariate logistic regression analysis is performed for each variable (e.g., frequency data for genes) (x₁, x₂, x₃, . . . ), with ME/CFS patient (y=1) or healthy individual (y=0) as the objective variable. In case of (2), logistic analysis is performed for every combination using a plurality of variables.
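The exhaustive search in (2) can be sketched as below. This is only an illustration under stated assumptions: the fit uses plain gradient ascent written in standard-library Python rather than the glm function of R used in the Examples, all function names are hypothetical, and the toy table contains made-up values, not study data.

```python
import math
from itertools import combinations


def fit_logistic(rows, labels, steps=2000, rate=0.1):
    """Estimate b0..bp by gradient ascent on the mean log-likelihood."""
    coefs = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(coefs)
        for x, y in zip(rows, labels):
            z = coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))
            z = max(-30.0, min(30.0, z))  # keep exp() within range
            err = y - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        coefs = [b + rate * g / len(rows) for b, g in zip(coefs, grad)]
    return coefs


def auc(scores, labels):
    """ROC AUC as the fraction of (patient, control) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def rank_combinations(table, labels, size):
    """Fit one model per combination of `size` variables, sort by AUC."""
    results = []
    for names in combinations(sorted(table), size):
        rows = list(zip(*(table[name] for name in names)))
        coefs = fit_logistic(rows, labels)
        scores = [coefs[0] + sum(b * v for b, v in zip(coefs[1:], x)) for x in rows]
        results.append((auc(scores, labels), names))
    return sorted(results, reverse=True)


# Hypothetical usage frequencies (%) for 4 patients (y=1) and 4 controls (y=0).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_table = {
    "geneA": [2.6, 1.9, 3.1, 2.2, 1.0, 1.4, 0.8, 2.0],
    "geneB": [0.1, 0.3, 0.2, 0.1, 0.4, 0.2, 0.5, 0.3],
    "geneC": [1.0, 2.0, 1.5, 0.5, 1.1, 1.9, 0.6, 1.4],
}
ranking = rank_combinations(toy_table, toy_labels, 2)
```

Each entry of `ranking` pairs an in-sample AUC with the variable names that produced it, which mirrors the layout of the AUC tables above. With small sample sizes, in-sample AUC values of this kind are optimistic and would normally be confirmed on independent data.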

While the generalized linear model function glm of R can be utilized for logistic regression analysis, analysis is not limited to this function. With logistic regression analysis, the following logit model formula is obtained with the probability of having ME/CFS as π, and the constant b0 and the partial regression coefficients b1 to bp are found at the same time. Specifically, the coefficients of a logit model formula can be determined by using a data set of variables (gene frequency and the like) and the distinction of patient/healthy individual.

$\begin{matrix} {{\ln\left( \frac{\pi}{1 - \pi} \right)} = {b_{0} + {b_{1}x_{1}} + {b_{2}x_{2}} + \mspace{14mu}\ldots\mspace{14mu} + {b_{p}x_{p}}}} & \left\lbrack {{Numeral}\mspace{14mu} 1} \right\rbrack \end{matrix}$

π: probability of having ME/CFS, b0: constant, b1 to bp: partial regression coefficients

The probability π of having ME/CFS can be determined by the following equation once the constant and partial regression coefficients are determined. If frequency data is newly obtained, differentiation and prediction can be performed by inputting the values thereof (e.g., a subject with a probability of 0.5 or greater is predicted to have ME/CFS).

$\begin{matrix} {\pi = \frac{1}{1 + e^{- {({b_{0} + {b_{1}x_{1}} + {b_{2}x_{2}} + \mspace{14mu}\ldots\mspace{14mu} + {b_{p}x_{p}}})}}}} & \left\lbrack {{Numeral}\mspace{14mu} 2} \right\rbrack \end{matrix}$
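The probability formula follows from the logit model formula by a short rearrangement. In the sketch below, $z$ is shorthand introduced only for this note and denotes the linear predictor:

$z = b_{0} + b_{1}x_{1} + b_{2}x_{2} + \ldots + b_{p}x_{p}$

$\ln\left( \frac{\pi}{1 - \pi} \right) = z \;\Rightarrow\; \frac{\pi}{1 - \pi} = e^{z} \;\Rightarrow\; \pi = \frac{e^{z}}{1 + e^{z}} = \frac{1}{1 + e^{- z}}$

Since $e^{- z} \leq 1$ exactly when $z \geq 0$, a cutoff of $\pi \geq 0.5$ is equivalent to the linear predictor being non-negative.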

A two-step prediction method is also possible: ME/CFS is first predicted with a differentiation formula for distinguishing ME/CFS from healthy individuals, and MS patients are then excluded with a differentiation formula for distinguishing ME/CFS from MS. In either step, frequency data can be inputted into the same logit model formula described above to predict the probability.

3-1. One Step Prediction Model

ME/CFS and non-ME/CFS (healthy individuals and MS patients) are predicted with a single differentiation formula. Patients are differentiated with a single step by one of the following examples or a combination of other predictor variables.

(Example using 13 types of IGHs found to have a significant difference)

Predictor variables: IGHV3-49, IGHV3-30-3, IGHD1-26, IGHGP, IGHV4-34, IGHV3-30, IGHV4-31, IGHV3-64, IGHJ6, IGHD3-22, IGHV3-33, IGHV3-73, IGHV5-10-1

$\begin{matrix} {\text{Differentiation formula: } \log\left( \frac{\pi}{1 - \pi} \right) = - 2.445e^{2} + 1.830e^{2}\left( \%\ \text{IGHV3-49} \right) + 1.348e^{2}\left( \%\ \text{IGHV4-34} \right) + 8.355e^{1}\left( \%\ \text{IGHV3-30} \right) + 7.964e^{1}\left( \%\ \text{IGHV4-31} \right) - 2.840\left( \%\ \text{IGHV3-64} \right) + 5.919\left( \%\ \text{IGHJ6} \right) - 1.543e^{1}\left( \%\ \text{IGHD3-22} \right) - 1.146e^{3}\left( \%\ \text{IGHV3-33} \right) - 1.287e^{1}\left( \%\ \text{IGHV3-73} \right) + 1.306^{2}\left( \%\ \text{IGHV5-10-1} \right)} & \left\lbrack \text{Numeral 3} \right\rbrack \end{matrix}$

π: probability of having ME/CFS, π/(1−π): odds of having ME/CFS

(Example using IGHs of two variables with a high AUC value)

Predictor  variables:  IGHGP, IGHV 3-30-3 ${{{Differentiation}\mspace{14mu}{formula}\text{:}\mspace{14mu}{\log\left( \frac{\pi}{1 - \pi} \right)}} = {{{- 3.094}e^{- 1}} - {2.0{.05}{e^{1}\left( {\%\mspace{14mu}{IGHGP}} \right)}} + {1.496{e^{1}\left( {\%\mspace{14mu}{IGHV}\; 3\text{-}30\text{-}3} \right)}}}}\mspace{11mu}$

(Example using IGHs of three variables with a high AUC value)

Predictor  variables:  IGHGP, IGHV 3-30-3, IGHV 3-49 ${{{Differentiation}\mspace{14mu}{formula}\text{:}\mspace{14mu}{\log\left( \frac{\pi}{1 - \pi} \right)}} = {{- 1.770} - {2.269{e^{1}\left( {\%\mspace{14mu}{IGHGP}} \right)}} + {1.778\left( {\%\mspace{14mu}{IGHV}\; 3\text{-}30\text{-}3} \right)} + {8.467{e^{- 1}\left( {\%\mspace{14mu}{IGHV}\; 3\text{-}49} \right)}}}}\mspace{20mu}$
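As a worked illustration, the three-variable formula above can be evaluated numerically. The sketch rests on two assumptions that are not stated in the text: that a coefficient written as 2.269e^{1} is scientific notation for 2.269 × 10^1 (and 8.467e^{-1} for 8.467 × 10^-1), and that the group means in Table 33 are usage frequencies in percent and can serve as example inputs. The function names are hypothetical.

```python
import math

# Coefficients copied from the three-variable one-step formula.
# ASSUMPTION: "e^{k}" in the formula is read as "x 10^k".
INTERCEPT = -1.770
COEFFICIENTS = {"IGHGP": -2.269e1, "IGHV3-30-3": 1.778, "IGHV3-49": 8.467e-1}


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def one_step_probability(usage_percent):
    """Probability of ME/CFS from usage frequencies given in percent."""
    logit = INTERCEPT + sum(b * usage_percent[g] for g, b in COEFFICIENTS.items())
    return sigmoid(logit)


# Illustrative inputs only: group mean values listed in Table 33.
mecfs_means = {"IGHGP": 0.03, "IGHV3-30-3": 1.44, "IGHV3-49": 2.63}
non_mecfs_means = {"IGHGP": 0.13, "IGHV3-30-3": 0.81, "IGHV3-49": 1.11}

p_mecfs = one_step_probability(mecfs_means)
p_non_mecfs = one_step_probability(non_mecfs_means)
```

Under these assumptions the ME/CFS group means give a probability of about 0.91 and the non-ME/CFS group means about 0.09, which fall on opposite sides of the 0.5 cutoff.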

3-2. Two Step Prediction Model

ME/CFS is differentiated from healthy individuals with differentiation formula 1, and then patients with other diseases such as MS are excluded with differentiation formula 2 to predict ME/CFS patients. Patients are differentiated with two steps by one of the following examples or a combination of other predictor variables.

1) Differentiation of ME/CFS patients from healthy individuals (differentiation formula 1)

[Numeral 4]

(Example using 6 types of IGHs found to have a significant difference between the two groups)

Predictor variables: IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, IGHJ6 ${{Differentiation}\mspace{14mu}{formula}\text{:}\mspace{14mu}{\log\left( \frac{\pi}{1 - \pi} \right)}} = {{- 6.104} + {4.368{e^{1}\left( {\%\mspace{14mu}{IGHV}\; 1\text{-}3} \right)}} - {6.1560{e^{- 1}\left( {\%\mspace{14mu}{IGHV}\; 3\text{-}30} \right)}} + {1.838\left( {\%\mspace{14mu}{IGHV}\; 3\text{-}30\text{-}3} \right)} + {9.945{e^{- 1}\left( {\%\mspace{14mu}{IGHV}\; 3\text{-}49} \right)}} + {1.7019{e^{- 1}\left( {\%\mspace{14mu}{IGHD}\; 1\text{-}26} \right)}} + {1.1956{e^{- 1}\left( {\%\mspace{14mu}{IGHJ}\; 6} \right)}}}$

(Example using IGHs of two variables with a high AUC value)

Predictor  variables:  IGHGP, IGHV 3-30-3 ${\log\left( \frac{\pi}{1 - \pi} \right)} = {{{- 3.883}e^{- 1}} - {2.096{e^{1}\left( {\%\mspace{14mu}{IGHGP}} \right)}} + {2.169\left( {\%\mspace{14mu}{IGHV}\; 3\text{-}30\text{-}3} \right)}}$

(Example using IGHs of three variables with a high AUC value)

Predictor  variables:  IGHGP, IGHV 3-30-3, IGHV 3-49 ${\log\left( \frac{\pi}{1 - \pi} \right)} = {{- 2.464} - {2.561{e^{1}\left( {\%\mspace{14mu}{IGHGP}} \right)}} + {2.877\left( {{\%\mspace{14mu}{IGHV}\;{Predictor}\mspace{14mu}{variables}\text{:}\mspace{14mu}{IGHGP}},{{IGHV}\; 3\text{-}30\text{-}3},{{IGHV}\; 3\text{-}49}} \right)}}$

2) Differentiation to exclude other diseases (MS patient) from ME/CFS patients (differentiation formula 2)

Predictor variables: IGHV3-7, IGHV3-23, IGHV3-23D, IGHV3-33, IGHV3-73, IGHV3-NL1, IGHV4-28, IGHV4-39, IGHD4-17, IGHD5-5, IGHD5-18

$\begin{matrix} {\text{Differentiation formula: } \log\left( \frac{\pi}{1 - \pi} \right) = 4.815 - 4.252e^{- 1}\left( \%\ \text{IGHV3-7} \right) - 6.898e^{- 2}\left( \%\ \text{IGHV3-23} \right) + 5.111\left( \%\ \text{IGHV3-23D} \right) - 3.214e^{- 1}\left( \%\ \text{IGHV3-33} \right) - 5.660e^{- 1}\left( \%\ \text{IGHV3-73} \right) - 9.159e^{- 1}\left( \%\ \text{IGHV3-NL1} \right) - 4.658e^{1}\left( \%\ \text{IGHV4-28} \right) - 8.966e^{- 2}\left( \%\ \text{IGHV4-39} \right) - 9.326e^{- 2}\left( \%\ \text{IGHD4-17} \right) + 3.157e^{1}\left( \%\ \text{IGHD5-5} \right) - 3.240e^{1}\left( \%\ \text{IGHD5-18} \right)} & \left\lbrack \text{Numeral 5} \right\rbrack \end{matrix}$

Differentiation formula using two types of IGHs as predictor variables

Predictor variables: IGHGP, IGHD 3-3

$\log\left( \frac{\pi}{1 - \pi} \right) = 3.023 + 6.026e^{-1}\left( \%\ \mathrm{IGHD}\;3\text{-}3 \right) - 7.002e^{1}\left( \%\ \mathrm{IGHGP} \right)$

Differentiation formula using three types of IGHs as predictor variables

Predictor variables: IGHGP, IGHD 3-3, IGHD 1-14

$\log\left( \frac{\pi}{1 - \pi} \right) = 5.240 + 1.492\left( \%\ \mathrm{IGHD}\;3\text{-}3 \right) - 1.379\left( \%\ \mathrm{IGHGP} \right) - 7.432\left( \%\ \mathrm{IGHD}\;1\text{-}14 \right)$
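The two differentiation steps (normal control versus ME/CFS, then ME/CFS versus MS) can be chained as in the following minimal sketch, which uses the two-variable formula from each step. Several points are assumptions not stated in the text: that exponents such as "e^{1}" are scientific notation, that "log" is the natural logarithm, that the IGHD 3-3 term is taken as written in the formula, and that a cutoff of 0.5 is applied to π at each step. The cutoff and all function names are hypothetical. The sketch reports only whether each step exceeds the cutoff and does not assign a diagnosis.

```python
import math


def sigmoid(logit):
    """Invert log(pi / (1 - pi)) (natural log assumed)."""
    return 1.0 / (1.0 + math.exp(-logit))


def step1_logit(pct_ighgp, pct_ighv3_30_3):
    # Two-variable formula of differentiation step 1.
    return -3.883e-1 - 2.096e1 * pct_ighgp + 2.169 * pct_ighv3_30_3


def step2_logit(pct_ighd3_3, pct_ighgp):
    # Two-variable formula of differentiation step 2.
    return 3.023 + 6.026e-1 * pct_ighd3_3 - 7.002e1 * pct_ighgp


def two_step_screen(usage_pct, cutoff=0.5):
    """Return (pi_step1, pi_step2, passes_both) for a gene -> % dict.

    The cutoff is a hypothetical placeholder, not a value from the text.
    Step 2 is evaluated only when step 1 exceeds the cutoff.
    """
    pi1 = sigmoid(step1_logit(usage_pct["IGHGP"], usage_pct["IGHV3-30-3"]))
    if pi1 < cutoff:
        return pi1, None, False
    pi2 = sigmoid(step2_logit(usage_pct["IGHD3-3"], usage_pct["IGHGP"]))
    return pi1, pi2, pi2 >= cutoff
```

Returning both probabilities alongside the flag leaves the choice of cutoff, and the interpretation of π at each step, to whoever validates the model on real data.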

(Note)

As described above, the present disclosure is exemplified by the use of its preferred embodiments. However, it is understood that the scope of the present disclosure should be interpreted solely based on the Claims. It is also understood that any patent, any patent application, and any references cited herein should be incorporated herein by reference in the same manner as the contents are specifically described herein.

RELATED APPLICATION

The present application claims priority to Japanese Patent Application No. 2018-155380 filed on Aug. 22, 2018 and Japanese Patent Application No. 2019-44885 filed on Mar. 12, 2019. The entire content thereof is incorporated herein by reference for any purpose.

INDUSTRIAL APPLICABILITY

The present disclosure can be used in the diagnosis of, and in diagnostic agents for, myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS).

SEQUENCE LISTING FREE TEXT

SEQ ID NO: 1: BSL18E primer
SEQ ID NO: 2: P20EA primer
SEQ ID NO: 3: P10EA primer
SEQ ID NO: 4: CG1 primer
SEQ ID NO: 5: CG2 primer
SEQ ID NO: 6: P22EA-ST1-R primer
SEQ ID NO: 7: CG-ST1-R primer

1. A method of diagnosing a subject as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), comprising: determining a B cell receptor (BCR) repertoire in a subject; and diagnosing the subject as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) based on the BCR repertoire.
 2. The method of claim 1, wherein the diagnosing is based on one or more variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR in a subject as an indicator of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in the subject, and not another disease.
 3. (canceled)
 4. The method of claim 1, wherein the one or more genes comprise at least one gene selected from the group consisting of IGHV1-2, IGHV1-3, IGHV1-8, IGHV1-18, IGHV1-24, IGHV1-38-4, IGHV1-45, IGHV1-46, IGHV1-58, IGHV1-69, IGHV1-69-2, IGHV1-69D, IGHV1/OR15-1, IGHV1/OR15-5, IGHV1/OR15-9, IGHV1/OR21-1, IGHV2-5, IGHV2-26, IGHV2-70, IGHV2-70D, IGHV2/OR16-5, IGHV3-7, IGHV3-9, IGHV3-11, IGHV3-13, IGHV3-15, IGHV3-16, IGHV3-20, IGHV3-21, IGHV3-23, IGHV3-23D, IGHV3-25, IGHV3-30, IGHV3-30-3, IGHV3-30-5, IGHV3-33, IGHV3-35, IGHV3-38, IGHV3-38-3, IGHV3-43, IGHV3-43D, IGHV3-48, IGHV3-49, IGHV3-53, IGHV3-64, IGHV3-64D, IGHV3-66, IGHV3-72, IGHV3-73, IGHV3-74, IGHV3-NL1, IGHV3/OR15-7, IGHV3/OR16-6, IGHV3/OR16-8, IGHV3/OR16-9, IGHV3/OR16-10, IGHV3/OR16-12, IGHV3/OR16-13, IGHV4-4, IGHV4-28, IGHV4-30-2, IGHV4-30-4, IGHV4-31, IGHV4-34, IGHV4-38-2, IGHV4-39, IGHV4-59, IGHV4-61, IGHV4/OR15-8, IGHV5-10-1, IGHV5-51, IGHV6-1, IGHV7-4-1, IGHV7-81, IGHD1-1, IGHD1-7, IGHD1-14, IGHD1-20, IGHD1-26, IGHD1/OR15-1a/b, IGHD2-2, IGHD2-8, IGHD2-15, IGHD2-21, IGHD2/OR15-2a/b, IGHD3-3, IGHD3-9, IGHD3-10, IGHD3-16, IGHD3-22, IGHD3/OR15-3a/b, IGHD4-4, IGHD4-11, IGHD4-17, IGHD4-23, IGHD4/OR15-4a/b, IGHD5-5, IGHD5-12, IGHD5-18, IGHD5-24, IGHD5/OR15-5a/b, IGHD6-6, IGHD6-13, IGHD6-19, IGHD6-25, IGHD7-27, IGHJ1, IGHJ2, IGHJ3, IGHJ4, IGHJ5, IGHJ6, IGHG1, IGHG2, IGHG3, IGHG4, and IGHGP.
 5.-8. (canceled)
 9. The method of claim 2, wherein the one or more variables further comprise one or more immune cell subpopulation counts, wherein the one or more variables exhibit AUC ≥0.7 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS or wherein the one or more variables exhibit AUC ≥0.8 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS, and/or wherein the immune cell subpopulation count is selected from a B cell count, a naïve B cell count, a memory B cell count, a plasmablast count, an activated naïve B cell count, a transitional B cell count, a regulatory T cell count, a memory T cell count, a follicular helper T cell count, a Tfh1 cell count, a Tfh2 cell count, a Tfh17 cell count, a Th1 cell count, a Th2 cell count, and a Th17 cell count.
 10.-12. (canceled)
 13. The method of claim 2, wherein the one or more variables comprise usage frequencies of two or more genes in an IgG H chain variable region of a BCR in the subject, and/or wherein the one or more variables exhibit AUC ≥0.7 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS, wherein the one or more variables exhibit AUC ≥0.8 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS or wherein the one or more variables exhibit AUC ≥0.9 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS.
 14.-16. (canceled)
 17. The method of claim 2, wherein the one or more variables comprise a combination of two or more variables selected from the group consisting of a usage frequency of one or more genes in an IgG H chain variable region of a BCR in the subject, a BCR diversity index in the subject, and one or more immune cell subpopulation counts in the subject, and wherein the one or more variables exhibit AUC ≥0.7 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS, wherein the one or more variables comprise three, four, five or six or more variables selected from the group, and/or wherein the one or more variables exhibit AUC ≥0.8 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS or wherein the one or more variables exhibit AUC ≥0.9 under a ROC curve in regression analysis for differentiating a normal control from ME/CFS.
 18.-24. (canceled)
 25. The method of claim 2, wherein the one or more variables comprise a B cell count in the subject, wherein the one or more variables comprise a regulatory T cell (Treg) count in the subject, wherein the usage frequency of the one or more genes is determined by a method comprising large scale highly efficient BCR repertoire analysis, and/or wherein the one or more variables comprise a usage frequency of at least one gene selected from the group consisting of IGHV3-49, IGHV3-30-3, IGHD1-26, IGHV3-30, IGHJ6, IGHGP, IGHV4-31, IGHV3-64, IGHD3-22, IGHV3-33, IGHV3-73, IGHV5-10-1, and IGHV4-34.
 26.-28. (canceled)
 29. The method of claim 1, comprising: (a) using a part of one or more variables for diagnosing ME/CFS in the subject; and (b) using a part of the one or more variables for diagnosing the subject as having ME/CFS, and not another disease.
 30. The method of claim 29, wherein (b) is performed a plurality of times for a plurality of other diseases.
 31. The method of claim 28, wherein the another disease comprises multiple sclerosis (MS).
 32. A method of diagnosing a subject as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), and not multiple sclerosis (MS) based on one or more variables comprising a usage frequency of one or more genes in an IgG H chain variable region of a BCR in the subject.
 33. The method of claim 32, wherein the one or more genes comprise at least one gene selected from the group consisting of IGHV1-2, IGHV1-3, IGHV1-8, IGHV1-18, IGHV1-24, IGHV1-38-4, IGHV1-45, IGHV1-46, IGHV1-58, IGHV1-69, IGHV1-69-2, IGHV1-69D, IGHV1/OR15-1, IGHV1/OR15-5, IGHV1/OR15-9, IGHV1/OR21-1, IGHV2-5, IGHV2-26, IGHV2-70, IGHV2-70D, IGHV2/OR16-5, IGHV3-7, IGHV3-9, IGHV3-11, IGHV3-13, IGHV3-15, IGHV3-16, IGHV3-20, IGHV3-21, IGHV3-23, IGHV3-23D, IGHV3-25, IGHV3-30, IGHV3-30-3, IGHV3-30-5, IGHV3-33, IGHV3-35, IGHV3-38, IGHV3-38-3, IGHV3-43, IGHV3-43D, IGHV3-48, IGHV3-49, IGHV3-53, IGHV3-64, IGHV3-64D, IGHV3-66, IGHV3-72, IGHV3-73, IGHV3-74, IGHV3-NL1, IGHV3/OR15-7, IGHV3/OR16-6, IGHV3/OR16-8, IGHV3/OR16-9, IGHV3/OR16-10, IGHV3/OR16-12, IGHV3/OR16-13, IGHV4-4, IGHV4-28, IGHV4-30-2, IGHV4-30-4, IGHV4-31, IGHV4-34, IGHV4-38-2, IGHV4-39, IGHV4-59, IGHV4-61, IGHV4/OR15-8, IGHV5-10-1, IGHV5-51, IGHV6-1, IGHV7-4-1, IGHV7-81, IGHD1-1, IGHD1-7, IGHD1-14, IGHD1-20, IGHD1-26, IGHD1/OR15-1a/b, IGHD2-2, IGHD2-8, IGHD2-15, IGHD2-21, IGHD2/OR15-2a/b, IGHD3-3, IGHD3-9, IGHD3-10, IGHD3-16, IGHD3-22, IGHD3/OR15-3a/b, IGHD4-4, IGHD4-11, IGHD4-17, IGHD4-23, IGHD4/OR15-4a/b, IGHD5-5, IGHD5-12, IGHD5-18, IGHD5-24, IGHD5/OR15-5a/b, IGHD6-6, IGHD6-13, IGHD6-19, IGHD6-25, IGHD7-27, IGHJ1, IGHJ2, IGHJ3, IGHJ4, IGHJ5, IGHJ6, IGHG1, IGHG2, IGHG3, IGHG4, and IGHGP.
 34. (canceled)
 35. The method of claim 32, wherein the one or more variables comprise usage frequencies of two or more genes in an IgG H chain variable region of a BCR in the subject, and/or wherein the one or more variables exhibit AUC ≥0.7 under a ROC curve in regression analysis for differentiating MS from ME/CFS, wherein the one or more variables exhibit AUC ≥0.8 under a ROC curve in regression analysis for differentiating MS from ME/CFS or wherein the one or more variables exhibit AUC ≥0.9 under a ROC curve in regression analysis for differentiating MS from ME/CFS.
 36.-38. (canceled)
 39. The method of claim 32, wherein the one or more variables comprise usage frequencies of three or more genes in an IgG H chain variable region of a BCR in the subject, and/or wherein the one or more variables exhibit AUC ≥0.7 under a ROC curve in regression analysis for differentiating MS from ME/CFS, wherein the one or more variables exhibit AUC ≥0.8 under a ROC curve in regression analysis for differentiating MS from ME/CFS or wherein the one or more variables exhibit AUC ≥0.9 under a ROC curve in regression analysis for differentiating MS from ME/CFS.
 40.-42. (canceled)
 43. The method of claim 31, wherein (b) is performed by the method of any one of claims 32, 33, 35 and 39.